Abstract: We introduce the branching transitive closure operator on weighted monadic second-order logic formulas, where the branching corresponds in a natural way to the branching inherent in trees. For arbitrary commutative semirings, we prove that weighted monadic second-order logic on trees is equivalent to definability by formulas which start with one of the following operators: (i) a branching transitive closure or (ii) an existential second-order quantifier followed by one universal first-order quantifier; in both cases the operator is applied to step-formulas over (a) Boolean first-order logic enriched by modulo counting or (b) Boolean monadic second-order logic.

ACM classification: F.1.1, F.4.1, F.4.3.

1. Introduction

In [BM92] monadic second-order logic (MSO) for strings is characterized by the extension of first-order (FO) logic with unary transitive closure (FO + TC$^{[1]}$). In [BGMZ10, Thm. 10] weighted restricted MSO for strings is characterized by the application of a (progressing) unary transitive closure operator to step formulas over FO formulas extended by modulo counting. For trees such a characterization of MSO in terms of transitive closure existed neither for the weighted nor for the unweighted case. In [tCS08] it was proved that MSO on trees is strictly more powerful than FO + TC$^{[1]}$. Moreover, MSO is strictly less powerful than $\bigcup_{k \ge 1}$ FO + TC$^{[k]}$, where TC$^{[k]}$ denotes the transitive closure of some binary relation over the set of $k$-tuples of positions in trees [TK09]; even FO + TC$^{[2]}$ contains a tree language which is not definable in MSO (cf. [TK09, Prop. 4]). This raises the following question: is there a version of transitive closure for trees which characterizes MSO in the unweighted and the weighted case?

… and TAMOP-4.2.1/B-09/1/KONV-2010-0005 of the Hungarian National Development Agency.

In this paper we define the concept of branching transitive closure for trees ($\psi$-TC)$^1$, and we characterize MSO on trees by $\psi$-TC applied to FO extended by modulo counting. Let us informally explain the concept of $\psi$-TC by first recalling how TC$^{[1]}$ works. The operator TC$^{[1]}$ is applied to a formula $\varphi(x, y)$ with two free variables $x$ and $y$, called input and output variable, respectively. Then TC$^{[1]}(\varphi)$ is interpreted as the transitive closure of the binary relation induced by $\varphi$. If TC$^{[1]}(\varphi)$ is interpreted on a tree, then a sequence of positions is chosen; these positions might be thought of as intermediate points of a tree-walk (cf. Fig. 1(a)). Contrary to this scenario, the operator $\psi$-TC$_x$ is applied to a finite family $\Phi = (\varphi_k(x, y_1, \ldots, y_k) \mid 0 \le k \le m)$ of formulas where $\varphi_k(x, y_1, \ldots, y_k)$ has the free input variable $x$ and the free output variables $y_1, \ldots, y_k$. Then $\psi$-TC$_x(\Phi)$ is interpreted on a tree as follows (cf. Fig. 1(b)). The interpretation starts by choosing an arbitrary position $v$ as assignment for $x$. Then, the operator chooses a branching degree $k$ and thereby the formula $\varphi_k(x, y_1, \ldots, y_k)$. To each $y_i$ it assigns a position $v_i$ such that $\varphi_k(v, v_1, \ldots, v_k)$ holds. This process is iterated, where the output positions of an iteration step become the input positions for the next step. Finally, the formula $\varphi_0(x)$ has to be chosen, which finishes the iteration. Hence, $\psi$-TC$_x(\Phi)$ reflects in a natural way the branching structure of trees.

$^1$ The role of $\psi$ will become clear later.

In our paper we characterize MSO by formulas of the form $\psi$-TC($\mathcal{L}$) where $\mathcal{L}$ is a class of FO formulas extended by modulo counting (similar to [BGMZ10]). Let us explain how an MSO formula is transformed into formulas of $\psi$-TC($\mathcal{L}$). For this we represent an arbitrary MSO formula by a finite-state tree automaton $\mathcal{A}$ [TW68, Don70]. Then we use the idea of [Tho82] of splitting the given input tree into slices, where the number $n$ of states completely determines the shape and the number of the slices. However, due to the branching inherent in trees, the appropriate definition of a slice was a technical challenge (cf. Section 6.1). For instance, in Fig. 2 for $n = 3$ the input tree $\xi$ is split into the shown slices $\zeta_1, \ldots, \zeta_6$.

The state behaviour of $\mathcal{A}$ on $\xi$ induces a state behaviour on the slices of $\xi$. Then, following the idea of [Tho82], the state in which the evaluation of a slice starts can be represented by a position of this slice. The retrieval of the state from a position uses the modulo counting technique. The state behaviour on the slices is handled by assigning the representing positions to the free variables of the instances of the $\varphi_k$-formulas.

To understand the idea, let us consider Fig. 2. Let us assume that $\mathcal{A}$ has the state set $\{0, 1, 2\}$ and that the evaluation of the slices $\zeta_1, \ldots, \zeta_6$ is started in states 0, 1, 1, 2, 0, and 1, respectively. Then, e.g., the state 0 in the slice $\zeta_1$ is represented by position $\varepsilon$, the state 1 in $\zeta_2$ by 2111, and the state 2 in $\zeta_4$ by 32111. Hence, the state behaviour on $\zeta_1$ is handled by $\varphi_2(x, y_1, y_2)$ under the assignment $x \mapsto \varepsilon$, $y_1 \mapsto 2111$, $y_2 \mapsto 32111$.
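
The encoding used in this example can be replayed mechanically. The following Python sketch is only an illustration: it assumes, as the three values above suggest, that a position represents the state given by its length modulo $n$. The function name `represented_state` and the encoding of Gorn positions as digit strings are hypothetical choices made here, not notation from the paper.

```python
def represented_state(position: str, n: int) -> int:
    """State represented by a Gorn position, assumed to be its length modulo n.

    The empty string stands for the root position (epsilon).
    """
    return len(position) % n


# The three representing positions mentioned in the example, with n = 3 states.
examples = {"": 0, "2111": 1, "32111": 2}
for position, expected_state in examples.items():
    assert represented_state(position, 3) == expected_state
```

The point of the encoding is that a first-order variable ranging over positions can carry a state, provided the logic can test the length of a position modulo $n$.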

This way of using the transitive closure operator shows that it suffices to choose the assignments to the free variables in a progressing manner downwards the tree. In general, for the formula $\varphi_k(x, y_1, \ldots, y_k)$ and the assignment $x \mapsto v$, $y_1 \mapsto v_1, \ldots, y_k \mapsto v_k$, the positions $v_1, \ldots, v_k$ are descendants of $v$. Actually, we describe explicitly by means of a progress formula $\psi(x, y_i)$ what the progress from $v$ to $v_i$ looks like. This shows the meaning of the parameter $\psi$ in the closure operator $\psi$-TC.

For the other inclusion $\psi$-TC($\mathcal{L}$) $\subseteq$ MSO we first simulate every formula of $\psi$-TC($\mathcal{L}$) by a formula of the form $\exists X.\forall x.\varphi$ where $\varphi \in \mathcal{L}$ (as it was done in [BGMZ10] for strings), and then observe that $\exists X.\forall x.\varphi$ is in MSO.

In fact, we prove our characterization result in a more general setting, viz. for weighted MSO logics over semirings [DV06]. There, the expression $\xi \models \varphi$ does not have a Boolean value, indicating whether $\xi$ is a model of $\varphi$ or not; rather, this expression takes a value in some given semiring [Gol99]. The progress formula $\psi$ guarantees that no infinite summations occur in the definition of the semantics of the operator $\psi$-TC. If the Boolean semiring (with disjunction and conjunction) is employed, then the classical, unweighted case is reobtained. Weighted MSO logic over semirings (on strings) was introduced in [DG05, DG07, DG09].
Figure 2: An example of slices, representation of states by positions, and formulas which handle the state behaviour of $\mathcal{A}$.
In our main result (Theorem 4.1) we generalize [BGMZ10, Thm. 10] to recognizable weighted tree languages, but only for commutative semirings. We prove that the expressive powers of the logics

(i) weighted RMSO, (ii) B-TC($\mathcal{L}_{\mathrm{step}}$), (iii) desc$^+$-TC($\mathcal{L}_{\mathrm{step}}$), and (iv) $\exists\forall(\mathcal{L}_{\mathrm{step}})$

are equivalent, where RMSO stands for restricted MSO, $\mathcal{L} \in \{\text{BFO+mod}, \text{BMSO}\}$, BFO and BMSO are the Boolean fragments of FO and MSO, resp., and mod allows modulo counting. Moreover,

• the logic $\mathcal{L}_{\mathrm{step}}$ contains all $\mathcal{L}$-step formulas (cf. [BGMZ10, Equ. (1)]),

• desc$^+$ refers to the particular progress formula desc$^+(x, y_i)$, which holds for an assignment $x \mapsto v$, $y_i \mapsto v_i$ only if $v_i$ is a descendant of $v$,

• the operator B-TC is a bounded version of desc$^+$-TC in which the distance between $v$ and $v_i$ is bounded by a natural number, and

• the logic $\exists\forall(\mathcal{L}_{\mathrm{step}})$ contains the set of formulas of the form $\exists X.\forall x.\varphi$ where $\varphi \in \mathcal{L}_{\mathrm{step}}$.

The handling of the weights is done in the same way as in [BGMZ10], from which we borrow several notations and notions; we also follow their lines of argumentation. However, the switch from strings to trees created two technical difficulties: (1) the appropriate splitting of an input tree into slices and (2) the unique representation of states in a slice in order to avoid counting a state behaviour too often. We employed the BFO-formulas form-cut and on-lmp, respectively, for handling these difficulties (cf. Section 6.2). Moreover, we use [Mal06, Prop. 18] (cf. Lemma 6.4) for the fact that the state-value behaviour of a weighted tree automaton on an input tree $\xi$ induces a state-value behaviour on the slices of $\xi$. This needs the commutativity of the semiring multiplication.

In Section 2 we recall general notations on trees, the definitions of weighted tree automata and (fragments of) weighted MSO. In Section 3 we introduce our branching transitive closure operator and illustrate it by means of an example. Section 4 shows the main result of this paper (cf. Theorem 4.1); its proof uses results which are proved in Sections 5, 6, and 7. We conclude in Section 8 by indicating some open problems.

We will use a number of macros in weighted MSO, and we introduce them at the places where they are first needed. For the convenience of the reader we have collected all the macros in alphabetic order in an appendix.



2. Preliminaries 
2.1. General Notation 

Let $\mathbb{N}$ denote the set $\{0, 1, 2, \ldots\}$ of natural numbers; let $\mathbb{N}_+ = \mathbb{N} \setminus \{0\}$. The cardinality of a set $A$ is denoted by $|A|$. Frequently we abbreviate a tuple $(a_1, \ldots, a_k) \in A^k$ by $a_1 \ldots a_k$. Note that $A^0 = \{(\,)\}$, and we abbreviate $(\,)$ by $\varepsilon$.

As usual, an alphabet is a non-empty and finite set. The set of strings over an alphabet $A$ is denoted by $A^*$. The empty string is denoted by $\varepsilon$ and the length of $w \in A^*$ is denoted by $|w|$.

2.2. Trees 

We use the usual notions and notations concerning trees, cf., e.g., [FV09]. For a ranked alphabet $\Sigma$ we denote by $\Sigma^{(k)}$ the set of symbols of $\Sigma$ having rank $k$ and by maxrk($\Sigma$) the maximal rank of symbols in $\Sigma$. The set of $\Sigma$-trees indexed by some set $A$ is denoted by $T_\Sigma(A)$. In case $A = \emptyset$ we write $T_\Sigma$ for $T_\Sigma(A)$. Given a tree $\xi \in T_\Sigma(A)$, we denote the set of its positions by pos($\xi$) $\subseteq (\mathbb{N}_+)^*$ (using the usual Gorn notation). We abbreviate a sequence $v_1, \ldots, v_k$ of positions by $v_{1,k}$. The lexicographic ordering on pos($\xi$) is denoted by $\le$. For every $\xi \in T_\Sigma(A)$ and $w \in$ pos($\xi$), we denote the label of $\xi$ at $w$ by $\xi(w)$ and the subtree of $\xi$ at $w$ by $\xi|_w$. For any set $B \subseteq A \cup \Sigma$, we denote by pos$_B(\xi)$ the set $\{w \in \mathrm{pos}(\xi) \mid \xi(w) \in B\}$. If, additionally, $\zeta \in T_\Sigma(A)$, then $\xi[\zeta]_w$ denotes the tree obtained from $\xi$ by replacing the subtree at $w$ by $\zeta$.

If $A$ is finite, then we can define the ranked alphabet $(\Sigma \cup A, \mathrm{rk})$ by $\mathrm{rk}(a) = 0$ for every $a \in A$ and $\mathrm{rk}(\sigma) = \mathrm{rk}_\Sigma(\sigma)$ for every $\sigma \in \Sigma$.

We define the height height($\xi$) and the size size($\xi$) of a tree $\xi \in T_\Sigma(A)$ recursively as follows. For every $\alpha \in A \cup \Sigma^{(0)}$, let height($\alpha$) $= 0$ and size($\alpha$) $= 1$, and for every $\sigma \in \Sigma^{(k)}$ with $k \ge 1$ and $\xi_1, \ldots, \xi_k \in T_\Sigma(A)$, let height$(\sigma(\xi_1, \ldots, \xi_k)) = 1 + \max\{\mathrm{height}(\xi_i) \mid 1 \le i \le k\}$ and size$(\sigma(\xi_1, \ldots, \xi_k)) = 1 + \sum_{i=1}^{k} \mathrm{size}(\xi_i)$.
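
The two recursions translate directly into code. The following Python sketch is a minimal illustration under an assumed encoding that is not part of the paper: a tree is a nested tuple `(label, child_1, ..., child_k)`, so a leaf is a one-element tuple.

```python
def height(tree):
    """Height of a tree given as a nested tuple (label, child_1, ..., child_k)."""
    children = tree[1:]
    if not children:
        return 0  # leaves (rank-0 symbols or index elements) have height 0
    return 1 + max(height(child) for child in children)


def size(tree):
    """Number of positions of the tree."""
    return 1 + sum(size(child) for child in tree[1:])


# Illustration: the tree sigma(sigma(sigma(alpha, alpha), alpha), alpha).
alpha = ("alpha",)
xi = ("sigma", ("sigma", ("sigma", alpha, alpha), alpha), alpha)
assert height(xi) == 3
assert size(xi) == 7
```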

A ranked alphabet $\Sigma$ is monadic if $\Sigma = \Sigma^{(1)} \cup \Sigma^{(0)}$ and $\Sigma^{(0)} = \{e\}$ is a singleton. For such a $\Sigma$ there is an obvious bijection from $T_\Sigma$ to $(\Sigma^{(1)})^*$ which transforms monadic trees into strings.

2.3. Weighted Tree Languages and Weighted Tree Automata 

A commutative semiring is an algebra $(S, +, \cdot, 0, 1)$ where $(S, +, 0)$ and $(S, \cdot, 1)$ are commutative monoids, $\cdot$ distributes over $+$, and $0$ is absorbing with respect to $\cdot$. As usual, we abbreviate $(S, +, \cdot, 0, 1)$ by $S$.

In this paper, S will always denote an arbitrary commutative semiring. 

For more details on semirings we refer the reader to [HW98, Gol99]. A weighted tree language is a mapping $r : T_\Sigma \to S$ for some ranked alphabet $\Sigma$. In particular, for every tree language $L \subseteq T_\Sigma$, we denote by $1_L$ the weighted tree language $1_L : T_\Sigma \to S$ with $1_L(\xi) = 1$ for every $\xi \in L$, and $1_L(\xi) = 0$ otherwise. We call $1_L$ the characteristic weighted tree language of $L$. A recognizable step function [DG05, DG07, DG09] is a weighted tree language $r : T_\Sigma \to S$ such that there are $n \ge 0$, recognizable tree languages $L_1, \ldots, L_n$ over $\Sigma$ [GS84, GS97], and coefficients $a_1, \ldots, a_n$ in $S$ such that $r = \sum_{i=1}^{n} a_i \cdot 1_{L_i}$.

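We recall the concepts of weighted tree automata from [FV09]. A weighted tree automaton over $S$ (wta) is a tuple $\mathcal{A} = (Q, \Sigma, \delta, F)$ where $Q$ is a finite, nonempty set (of states), $\Sigma$ is a ranked alphabet (of input symbols), $F \subseteq Q$ is a set of final states, and $\delta$ is a family $(\delta_k \mid k \in \mathbb{N})$ of weighted transitions with $\delta_k : Q^k \times \Sigma^{(k)} \times Q \to S$ for every $k \in \mathbb{N}$.

Let $k \in \mathbb{N}$ and $q_1, \ldots, q_k \in Q$. The mapping $h^{q_1 \cdots q_k} : T_\Sigma(Z_k) \to S^Q$ is recursively defined for every $q \in Q$ and

• for every $z_i \in Z_k$ by $h^{q_1 \cdots q_k}(z_i)_q = 1$ if $q = q_i$, and $0$ otherwise, and

• for every $\sigma \in \Sigma^{(l)}$ with $l \ge 0$ and $\xi_1, \ldots, \xi_l \in T_\Sigma(Z_k)$ by

$h^{q_1 \cdots q_k}(\sigma(\xi_1, \ldots, \xi_l))_q = \sum_{p_1 \cdots p_l \in Q^l} \Big( \prod_{i=1}^{l} h^{q_1 \cdots q_k}(\xi_i)_{p_i} \Big) \cdot \delta_l(p_1 \cdots p_l, \sigma, q)$.

We abbreviate $h^{\varepsilon}$ by $h$. The weighted tree language recognized by $\mathcal{A}$, denoted also by $\mathcal{A}$, is the mapping $r_{\mathcal{A}} : T_\Sigma \to S$ defined for every $\xi \in T_\Sigma$ by

$r_{\mathcal{A}}(\xi) = \sum_{q \in F} h(\xi)_q$.

A weighted tree language $r : T_\Sigma \to S$ is recognizable if there exists a wta $\mathcal{A}$ such that $r_{\mathcal{A}} = r$.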
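
To make the bottom-up evaluation concrete, here is a minimal Python sketch of the mapping $h$ for trees without variables and of $r_{\mathcal{A}}$, specialised to the semiring of natural numbers. Everything about the encoding is an assumption made for illustration: trees are nested tuples `(label, child_1, ..., child_k)`, `delta` is a dictionary from `(child_states, symbol, state)` to a weight with missing entries read as 0, and the leaf-counting automaton is a made-up example, not one from the paper.

```python
from itertools import product
from math import prod


def run_wta(tree, states, delta, final):
    """Value r_A(tree) of a weighted tree automaton over (N, +, *, 0, 1)."""

    def h(t):
        symbol, children = t[0], t[1:]
        child_vectors = [h(child) for child in children]
        vector = {}
        for q in states:
            total = 0
            # Sum over all state tuples p_1 ... p_l for the l children.
            for ps in product(states, repeat=len(children)):
                weight = delta.get((ps, symbol, q), 0)
                if weight:
                    total += weight * prod(
                        vec[p] for vec, p in zip(child_vectors, ps)
                    )
            vector[q] = total
        return vector

    root_vector = h(tree)
    return sum(root_vector[q] for q in final)


# Hypothetical automaton: state 1 means "exactly one leaf has been marked",
# state 0 means "no leaf has been marked"; the value is the number of leaves.
delta = {
    ((), "alpha", 0): 1,
    ((), "alpha", 1): 1,
    ((0, 0), "sigma", 0): 1,
    ((1, 0), "sigma", 1): 1,
    ((0, 1), "sigma", 1): 1,
}
xi = ("sigma", ("alpha",), ("sigma", ("alpha",), ("alpha",)))
assert run_wta(xi, states=(0, 1), delta=delta, final=(1,)) == 3
```

The sketch enumerates all state tuples at every node, which is exponential in the rank; it is meant to mirror the defining sum, not to be efficient.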

We note that a wta over a monadic ranked alphabet of input symbols is equivalent to an initial state normalized weighted automaton (as, e.g., used in [BGMZ10]). For a more detailed discussion about this special case we refer to [FV09, p. 324].

We call a wta $\mathcal{A} = (Q, \Sigma, \delta, F)$ final state normalized if $|F| = 1$.

Lemma 2.1 [FV09, Thm. 3.6] For every wta there is an equivalent wta which is final state normalized.

2.4. Weighted Logics 

The Basic Weighted MSO-Logic: Here we recall the weighted MSO-logic on trees which we will use in this paper. This weighted logic has its origin in [DG05, DG07, DG09] where it was defined for strings. It has been extended to trees in [DV06, FV09, DV10]. We present it in the form of [BGMZ10].

As usual in MSO-logic, we use first-order variables, like $x, x_1, x_2, \ldots, y, z$ and second-order variables, like $X, X_1, X_2, \ldots, Y, Z$.

We define the set of weighted MSO-logic formulas over $\Sigma$ and $S$, denoted by MSO($\Sigma$, $S$) (or shortly: MSO), to be the set of formulas generated by the following EBNF with nonterminal $\varphi$:

$\varphi ::= a \mid \mathrm{label}_\sigma(x) \mid \mathrm{edge}_i(x, y) \mid x \preceq y \mid x \in X \mid$

$\qquad \neg\varphi \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \exists x.\varphi \mid \forall x.\varphi \mid \exists X.\varphi \mid \forall X.\varphi$,

where $a \in S$, $x, y$ are first-order variables, $\sigma \in \Sigma$, $1 \le i \le \mathrm{maxrk}(\Sigma)$, and $X$ is a second-order variable. We will abbreviate a sequence $\exists x_1. \ldots \exists x_k$ of quantifications by $\exists x_{1,k}$. The set of free variables of a formula $\varphi$ is denoted by Free($\varphi$). The formula $\varphi$ is called a sentence if Free($\varphi$) $= \emptyset$. Often we indicate the free variables of a formula explicitly. For instance, if a formula $\varphi$ has the free variables $x$, $y$, and $z$, then we denote this fact by $\varphi(x, y, z)$. If $x_1, \ldots, x_k$ are the free variables of some formula $\psi$, then we write $\psi(x_{1,k})$, and accordingly for other sequences of variables.

As usual in logics, we deal with free variables of a formula by means of variable assignments. Rather than repeating all the technical details here we only collect the most important notations and refer the reader to [DG05, DG07, DG09, DV06, FV09, DV10, BGMZ10] for details.

Let $\xi \in T_\Sigma$. For a finite set $\mathcal{V}$ of first-order and second-order variables we denote a $\mathcal{V}$-assignment for $\xi$ by $\rho$. For any position $w \in \mathrm{pos}(\xi)$ and set $W \subseteq \mathrm{pos}(\xi)$, we denote the $x$- and $X$-update of $\rho$ by $\rho[x \mapsto w]$ and $\rho[X \mapsto W]$, respectively.

In the usual way, we can encode a pair $(\xi, \rho)$, where $\rho$ is a $\mathcal{V}$-assignment for $\xi$, as a tree $\zeta$ over the ranked alphabet $\Sigma_{\mathcal{V}}$ with $\Sigma_{\mathcal{V}}^{(k)} = \Sigma^{(k)} \times \mathcal{P}(\mathcal{V})$ for every $k \in \mathbb{N}$. A tree $\zeta \in T_{\Sigma_{\mathcal{V}}}$ is called valid if for every first-order variable $x \in \mathcal{V}$ there is a unique $w \in \mathrm{pos}(\zeta)$ such that $x$ occurs in the second component of $\zeta(w)$. We denote the set of all valid trees in $T_{\Sigma_{\mathcal{V}}}$ by $T^{\mathrm{v}}_{\Sigma_{\mathcal{V}}}$.

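Let $\varphi \in$ MSO and $\mathcal{V}$ be a finite set of variables containing Free($\varphi$). The semantics of $\varphi$ is the weighted tree language $[\![\varphi]\!]_{\mathcal{V}} : T_{\Sigma_{\mathcal{V}}} \to S$ defined as follows. If $\zeta \in T_{\Sigma_{\mathcal{V}}}$ is not valid, then we put $[\![\varphi]\!]_{\mathcal{V}}(\zeta) = 0$. Otherwise, we define $[\![\varphi]\!]_{\mathcal{V}}(\zeta) \in S$ inductively as follows, where $(\xi, \rho)$ corresponds to $\zeta$.

$[\![a]\!]_{\mathcal{V}}(\zeta) = a$

$[\![\mathrm{label}_\sigma(x)]\!]_{\mathcal{V}}(\zeta) = 1$ if $\xi(\rho(x)) = \sigma$, and $0$ otherwise

$[\![\mathrm{edge}_i(x, y)]\!]_{\mathcal{V}}(\zeta) = 1$ if $\rho(y) = \rho(x)i$, and $0$ otherwise

$[\![x \preceq y]\!]_{\mathcal{V}}(\zeta) = 1$ if $\rho(x) \le \rho(y)$, and $0$ otherwise

$[\![x \in X]\!]_{\mathcal{V}}(\zeta) = 1$ if $\rho(x) \in \rho(X)$, and $0$ otherwise

$[\![\neg\varphi]\!]_{\mathcal{V}}(\zeta) = 1$ if $[\![\varphi]\!]_{\mathcal{V}}(\zeta) = 0$, and $0$ otherwise

$[\![\varphi \vee \psi]\!]_{\mathcal{V}}(\zeta) = [\![\varphi]\!]_{\mathcal{V}}(\zeta) + [\![\psi]\!]_{\mathcal{V}}(\zeta)$

$[\![\varphi \wedge \psi]\!]_{\mathcal{V}}(\zeta) = [\![\varphi]\!]_{\mathcal{V}}(\zeta) \cdot [\![\psi]\!]_{\mathcal{V}}(\zeta)$

$[\![\exists x.\varphi]\!]_{\mathcal{V}}(\zeta) = \sum_{w \in \mathrm{pos}(\zeta)} [\![\varphi]\!]_{\mathcal{V} \cup \{x\}}(\zeta[x \mapsto w])$

$[\![\exists X.\varphi]\!]_{\mathcal{V}}(\zeta) = \sum_{W \subseteq \mathrm{pos}(\zeta)} [\![\varphi]\!]_{\mathcal{V} \cup \{X\}}(\zeta[X \mapsto W])$

$[\![\forall x.\varphi]\!]_{\mathcal{V}}(\zeta) = \prod_{w \in \mathrm{pos}(\zeta)} [\![\varphi]\!]_{\mathcal{V} \cup \{x\}}(\zeta[x \mapsto w])$

$[\![\forall X.\varphi]\!]_{\mathcal{V}}(\zeta) = \prod_{W \subseteq \mathrm{pos}(\zeta)} [\![\varphi]\!]_{\mathcal{V} \cup \{X\}}(\zeta[X \mapsto W])$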

The order of the factors in the product over pos($\zeta$) is arbitrary because $S$ is a commutative semiring. Let $\varphi$ be a formula with free variables $x_1, \ldots, x_n$, $\xi \in T_\Sigma$, and $\rho$ a Free($\varphi$)-assignment for $\xi$ such that $\rho(x_i) = u_i$ for $1 \le i \le n$. Then we denote the semiring element $[\![\varphi]\!](\zeta)$ by $[\![\varphi]\!](\xi, u_1, \ldots, u_n)$, where $\zeta = (\xi, \rho)$.

We abbreviate $[\![\varphi]\!]_{\mathrm{Free}(\varphi)}$ by $[\![\varphi]\!]$. We say that two formulas $\varphi$ and $\psi$ with the same set of free variables are equivalent, and write $\varphi \equiv \psi$, if $[\![\varphi]\!] = [\![\psi]\!]$. For any $\mathcal{L} \subseteq$ MSO, a weighted tree language $r : T_\Sigma \to S$ is called $\mathcal{L}$-definable if there is a sentence $\varphi \in \mathcal{L}$ such that $r = [\![\varphi]\!]$.
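
The inductive semantics can be prototyped in a few lines. The following Python sketch is an illustration only, under several assumptions that are not part of the paper: the semiring is fixed to the natural numbers, trees are nested tuples `(label, child_1, ..., child_k)`, formulas are nested tuples with hypothetical tags such as `"exists1"` (first-order) and `"exists2"` (second-order), and the assignment is passed as a dictionary instead of being encoded into the tree, so the case of invalid trees does not arise.

```python
from itertools import chain, combinations
from math import prod


def labelled_positions(tree, prefix=()):
    """Yield (position, label) pairs; positions are tuples of child indices."""
    yield prefix, tree[0]
    for i, child in enumerate(tree[1:], start=1):
        yield from labelled_positions(child, prefix + (i,))


def subsets(items):
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def evaluate(phi, tree, rho=None):
    """Value of formula phi on tree under assignment rho, in (N, +, *, 0, 1)."""
    rho = {} if rho is None else rho
    label = dict(labelled_positions(tree))
    op = phi[0]
    if op == "const":                       # ("const", a)
        return phi[1]
    if op == "label":                       # ("label", sigma, x)
        return int(label[rho[phi[2]]] == phi[1])
    if op == "edge":                        # ("edge", i, x, y)
        return int(rho[phi[3]] == rho[phi[2]] + (phi[1],))
    if op == "leq":                         # ("leq", x, y), lexicographic order
        return int(rho[phi[1]] <= rho[phi[2]])
    if op == "in":                          # ("in", x, X)
        return int(rho[phi[1]] in rho[phi[2]])
    if op == "not":
        return int(evaluate(phi[1], tree, rho) == 0)
    if op == "or":
        return evaluate(phi[1], tree, rho) + evaluate(phi[2], tree, rho)
    if op == "and":
        return evaluate(phi[1], tree, rho) * evaluate(phi[2], tree, rho)
    if op in ("exists1", "forall1"):        # (tag, x, body)
        values = [evaluate(phi[2], tree, {**rho, phi[1]: w}) for w in label]
    elif op in ("exists2", "forall2"):      # (tag, X, body)
        values = [
            evaluate(phi[2], tree, {**rho, phi[1]: frozenset(W)})
            for W in subsets(label)
        ]
    else:
        raise ValueError(f"unknown connective: {op!r}")
    return sum(values) if op.startswith("exists") else prod(values)


xi = ("sigma", ("alpha",), ("sigma", ("alpha",), ("alpha",)))
# Existential quantification sums: this sentence counts the alpha-labelled leaves.
assert evaluate(("exists1", "x", ("label", "alpha", "x")), xi) == 3
# Universal quantification multiplies: one factor 2 for each of the 5 positions.
assert evaluate(("forall1", "x", ("const", 2)), xi) == 32
```

The two assertions show why the weighted reading differs from the Boolean one: $\exists$ counts witnesses and $\forall$ multiplies values over all positions.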
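A formula $\varphi$ is called Boolean-valued if $\{[\![\varphi]\!]_{\mathcal{V}}(\zeta) \mid \zeta \in T_{\Sigma_{\mathcal{V}}}\} \subseteq \{0, 1\}$ for every $\mathcal{V}$ containing Free($\varphi$). If $[\![\varphi]\!](\xi, u_1, \ldots, u_n) = 1$ for some $\xi \in T_\Sigma$ and $u_1, \ldots, u_n \in \mathrm{pos}(\xi)$, then we abbreviate this fact by writing that "$\varphi_\xi(u_1, \ldots, u_n)$ holds" or "we have $\varphi_\xi(u_1, \ldots, u_n)$" or just "$\varphi_\xi(u_1, \ldots, u_n)$".

For every $\varphi, \psi \in$ MSO, we define the macro $\varphi \to \psi := \neg\varphi \vee (\varphi \wedge \psi)$. Then, for every $\mathcal{V}$ containing Free($\varphi$) $\cup$ Free($\psi$) and $\zeta \in T_{\Sigma_{\mathcal{V}}}$, we have $[\![\varphi \to \psi]\!]_{\mathcal{V}}(\zeta) = 0$ if $\zeta$ is not valid. If $\zeta$ is valid and $\varphi$ is Boolean-valued, then we have that

$[\![\varphi \to \psi]\!]_{\mathcal{V}}(\zeta) = [\![\psi]\!]_{\mathcal{V}}(\zeta)$ if $[\![\varphi]\!]_{\mathcal{V}}(\zeta) = 1$, and $1$ otherwise.

Clearly, if $\varphi$ and $\psi$ are Boolean-valued, then $\varphi \to \psi$ is Boolean-valued.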
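
The value of the implication macro on a valid tree can be checked by unfolding the macro. The short derivation below is a sketch that assumes the usual reading of the connectives in this weighted setting (negation sends $0$ to $1$ and every other value to $0$, disjunction is the semiring sum, conjunction is the semiring product) and that $\varphi$ is Boolean-valued; it needs the `amsmath` package.

```latex
% Assumptions: [[not phi]] is 1 if [[phi]] = 0 and 0 otherwise,
% [[phi or psi]] = [[phi]] + [[psi]], [[phi and psi]] = [[phi]] * [[psi]],
% and [[phi]] takes only the values 0 and 1 (phi is Boolean-valued).
\begin{align*}
[\![\varphi \to \psi]\!]_{\mathcal V}(\zeta)
  &= [\![\neg\varphi]\!]_{\mathcal V}(\zeta)
     + [\![\varphi]\!]_{\mathcal V}(\zeta) \cdot [\![\psi]\!]_{\mathcal V}(\zeta) \\
  &= \begin{cases}
       0 + 1 \cdot [\![\psi]\!]_{\mathcal V}(\zeta) = [\![\psi]\!]_{\mathcal V}(\zeta)
         & \text{if } [\![\varphi]\!]_{\mathcal V}(\zeta) = 1, \\
       1 + 0 \cdot [\![\psi]\!]_{\mathcal V}(\zeta) = 1
         & \text{if } [\![\varphi]\!]_{\mathcal V}(\zeta) = 0.
     \end{cases}
\end{align*}
```

In particular the guard $\varphi$ contributes no weight of its own: either the weight of $\psi$ is passed through, or the neutral element $1$ of the multiplication is returned.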

The Boolean Fragment BMSO: Next we define the Boolean fragment of MSO according to [BGMZ10]. The Boolean fragment of MSO, denoted by BMSO, is the set of all formulas generated by the EBNF

$\varphi ::= 0 \mid 1 \mid \mathrm{label}_\sigma(x) \mid \mathrm{edge}_i(x, y) \mid x \preceq y \mid x \in X \mid \neg\varphi \mid \varphi \wedge \varphi \mid \forall x.\varphi \mid \forall X.\varphi$.

Clearly, every $\varphi \in$ BMSO is Boolean-valued.

In BMSO we define the following macros: for every $\varphi, \psi \in$ BMSO we let

• $\varphi \mathbin{\underline{\vee}} \psi := \neg(\neg\varphi \wedge \neg\psi)$,

• $\underline{\exists} x.\varphi := \neg\forall x.\neg\varphi$,

• $\underline{\exists} X.\varphi := \neg\forall X.\neg\varphi$,

• $\varphi \mathbin{\underline{\to}} \psi := \neg\varphi \mathbin{\underline{\vee}} (\varphi \wedge \psi)$.

Let $\varphi$ be a BMSO-formula. Then the semantics of $\varphi$ in $S$ coincides with that in $\mathbb{B}$ (cf. [BGMZ10]). Moreover, $\varphi$ can be considered as a classical (unweighted) MSO-formula for trees, and we can show easily that the semantics of $\varphi$ in $\mathbb{B}$ is $1_L$, where $L$ is the recognizable tree language defined by $\varphi$ (as classical MSO-formula). By Lemma 3.3(1) of [DV06], the weighted tree language $1_L$ is recognizable. Hence, we obtain the following.

Observation 2.2 The semantics of any BMSO-formula is a recognizable weighted tree language.

$\mathcal{L}$-step Formulas: Let $\mathcal{L} \subseteq$ BMSO be closed under $\wedge$ and $\neg$. According to [BGMZ10], the set of $\mathcal{L}$-step formulas, denoted by $\mathcal{L}_{\mathrm{step}}$, is the set of all MSO-formulas generated by the EBNF

$\varphi ::= a \mid \alpha \mid \neg\varphi \mid \varphi \vee \varphi \mid \varphi \wedge \varphi$ with $a \in S$ and $\alpha \in \mathcal{L}$.

We will use the following technical result.

Lemma 2.3 For every $\mathcal{L}$-step formula $\varphi$, there are $k \in \mathbb{N}_+$, $a_1, \ldots, a_k \in S$, and $\varphi_1, \ldots, \varphi_k \in \mathcal{L}$ such that $\varphi \equiv \bigvee_{1 \le i \le k} (a_i \wedge \varphi_i)$. In particular, the semantics of $\varphi$ is a recognizable step function.

Proof The first statement can be proved by an easy adaptation of [BGMZ10, Lm. 3]. The second statement follows from the first one, the definition of the semantics of MSO-formulas, and Observation 2.2. ■

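$\exists\forall(\mathcal{L})$-Formulas: Let $\mathcal{L} \subseteq$ MSO. The fragment $\exists\forall(\mathcal{L})$ consists of all MSO-formulas of the form

$\exists X.\forall x.\varphi$ (1)

for some formula $\varphi$ in $\mathcal{L}$. A weighted tree language $r : T_\Sigma \to S$ is $\exists\forall(\mathcal{L})$-definable if there is a formula $\exists X.\forall y.\varphi(x, X, y) \in \exists\forall(\mathcal{L})$ where $\varphi$ has the free variables $x$, $X$, and $y$ such that

$r(\xi) = [\![\exists X.\forall y.\varphi(x, X, y)]\!](\xi, \varepsilon)$

for every $\xi \in T_\Sigma$.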

The Fragment RMSO: We define the fragment RMSO of restricted MSO in the spirit of [Gas10, BGMZ10].

Formally, the fragment RMSO is the set of all weighted restricted MSO-formulas generated by the EBNF:

$\varphi ::= a \mid \mathrm{label}_\sigma(x) \mid \mathrm{edge}_i(x, y) \mid x \preceq y \mid x \in X \mid$

$\qquad \neg\psi \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \exists x.\varphi \mid \forall x.\psi \mid \exists X.\varphi \mid \forall X.\chi$,

where $\psi$ is a BMSO-step formula and $\chi$ is a BMSO-formula.

The Fragments of First-Order Logic FO and BFO: Another fragment of MSO is the set of weighted first-order formulas over $\Sigma$ and $S$, denoted by FO, which is the set of all formulas generated by the EBNF

$\varphi ::= a \mid \mathrm{label}_\sigma(x) \mid \mathrm{edge}_i(x, y) \mid x \preceq y \mid x \in X \mid \neg\varphi \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \exists x.\varphi \mid \forall x.\varphi$.

Note that second-order variables may occur (as free variables).

The fragment BFO is defined to be the intersection BMSO $\cap$ FO. That is, BFO is the set of all formulas generated by the EBNF

$\varphi ::= 0 \mid 1 \mid \mathrm{label}_\sigma(x) \mid \mathrm{edge}_i(x, y) \mid x \preceq y \mid x \in X \mid \neg\varphi \mid \varphi \wedge \varphi \mid \forall x.\varphi$.

The Fragment using Modulo Constraints BFO+mod: Let $n \in \mathbb{N}_+$ and $m \in \mathbb{N}$ such that $0 \le m < n$. We introduce the macro $|x| \equiv_n m$ with the only free variable $x$, and its intended meaning is as follows. For every tree $\xi \in T_\Sigma$ and $v \in \mathrm{pos}(\xi)$ we have

$[\![\,|x| \equiv_n m\,]\!](\xi, v) = 1$ if $|v| \equiv m \pmod{n}$, and $0$ otherwise. (2)
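We define $|x| \equiv_n m$ to be the following BMSO-formula:

$(|x| \equiv_n m) := \forall X.\Big(\big((x \in X) \wedge (\forall y.((y \in X) \wedge (|y| > n)) \mathbin{\underline{\to}} (y/n \in X))\big) \mathbin{\underline{\to}} (m \in |X|)\Big)$

where

• $(|y| > n) := \underline{\exists} x.\, \mathrm{desc}_{n+1}(x, y)$,

• $\mathrm{desc}_n(x, y) := \mathrm{desc}_{[n,n]}(x, y)$ for every $n \in \mathbb{N}_+$,

• $\mathrm{desc}_{[i,j]}(x, y) := \underline{\bigvee}_{w \in I} (y = xw)$ for every $i, j \in \mathbb{N}$ with $i \le j$ where $I = \{n_1 \ldots n_l \in (\mathbb{N}_+)^* \mid i \le l \le j\}$,

• $(y = xw) := \underline{\exists} y_{0,n}.\, (x = y_0) \wedge \text{form-path}_w(y_{0,n}) \wedge (y_n = y)$ for every $w = w_1 \ldots w_n \in (\mathbb{N}_+)^*$ with $n \in \mathbb{N}$,

• $(x = y) := (x \preceq y) \wedge (y \preceq x)$,

• $\text{form-path}_w(y_{0,n}) := \bigwedge_{1 \le i \le n} \mathrm{edge}_{w_i}(y_{i-1}, y_i)$ ($y_0, \ldots, y_n$ form a path via $w_1 \cdots w_n$),

• $(y/n \in X) := \underline{\exists} x.\, (x \in X) \wedge \mathrm{desc}_n(x, y)$,

• $\mathrm{edge}(x, y) := \underline{\bigvee}_{1 \le i \le \mathrm{maxrk}(\Sigma)} \mathrm{edge}_i(x, y)$,

• $\mathrm{root}(x) := \forall y.\neg\,\mathrm{edge}(y, x)$,

• $(m \in |X|) := \underline{\exists} x, y.\, \mathrm{root}(x) \wedge \mathrm{desc}_m(x, y) \wedge (y \in X)$.

It is easy to see that (2) holds.

Let us denote by BFO+mod the fragment of BMSO which we obtain by adding the formula $|x| \equiv_n m$ for every $n \in \mathbb{N}_+$ and $m \in \mathbb{N}$ to the list of the alternatives defining the fragment BFO.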

3. Branching Transitive Closure 

In this section we introduce our branching transitive closure operator $\psi$-TC$_x$ with progress formula $\psi$, and we define its application to a family $\Phi$ of formulas of the form $\varphi_k(x, y_1, \ldots, y_k)$ with one free input variable and $k$ free output variables. First let us deal with the possible forms of the progress formulas.

As indicated in the introduction, it suffices to use progress formulas which restrict the choice of positions to a progress downwards the tree. Moreover, in order to avoid infinite summations in the semiring, we require that the progress formula is irreflexive. Formally, let $\psi(x, y)$ be a formula in BFO with two free variables. We call $\psi(x, y)$ a progress formula if $\psi_\xi(v, v')$ implies $\mathrm{desc}^+_\xi(v, v')$ for every $\xi \in T_\Sigma$ and $v, v' \in \mathrm{pos}(\xi)$ where the BFO-formula $\mathrm{desc}^+(x, y)$ is defined by
• $\mathrm{desc}^+(x, y) := \mathrm{desc}(x, y) \wedge \neg(x = y)$,

• $\mathrm{desc}(x, y) := (x \preceq y) \wedge \neg\underline{\exists} z, z', z''.\ \underline{\bigvee}_{1 \le i < j \le \mathrm{maxrk}(\Sigma)} \big(\mathrm{edge}_i(z, z') \wedge \mathrm{edge}_j(z, z'')\big) \wedge (z' \preceq x) \wedge (x \prec z'') \wedge (z'' \preceq y)$ ($y$ is a direct descendant of $x$),

• $(x \prec y) := (x \preceq y) \wedge \neg(x = y)$.

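Since the progress should be made at each of the free output variables of $\varphi_k(x, y_1, \ldots, y_k)$, we extend the progress formula $\psi$ to a formula $\psi_k$ which fits to $\varphi_k$. Formally, for every $k \ge 1$, we define the macro

$\psi_k(x, y_{1,k}) := \bigwedge_{1 \le i \le k} \psi(x, y_i) \wedge \bigwedge_{1 \le i \le k-1} \mathrm{sibl}^+(y_i, y_{i+1})$,

where

• $\mathrm{sibl}^+(x, y) := \underline{\exists} x', y'.\big(\mathrm{sibl}(x', y') \wedge \mathrm{desc}(x', x) \wedge \mathrm{desc}(y', y)\big)$,

• $\mathrm{sibl}(x, y) := \underline{\exists} z.\ \underline{\bigvee}_{1 \le i < j \le \mathrm{maxrk}(\Sigma)} \mathrm{edge}_i(z, x) \wedge \mathrm{edge}_j(z, y)$ ($x$ is a younger sibling of $y$).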
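
On Gorn positions the relations behind these macros reduce to simple prefix tests. The following Python sketch illustrates this under assumptions of our own: positions are tuples of positive integers, both arguments are taken to be positions of the same tree (membership in pos($\xi$) is not checked), and the function names merely mirror the macro names.

```python
def desc(x, y):
    """y is a descendant of x or equal to x, i.e. x is a prefix of y."""
    return y[:len(x)] == x


def desc_plus(x, y):
    return desc(x, y) and x != y


def desc_bounded(x, y, i, j):
    """y lies below x at a distance between i and j (the relation desc_[i,j])."""
    return desc(x, y) and i <= len(y) - len(x) <= j


def sibl(x, y):
    """x and y have the same parent and x is a child with smaller index."""
    return len(x) == len(y) >= 1 and x[:-1] == y[:-1] and x[-1] < y[-1]


def sibl_plus(x, y):
    """Some ancestor-or-self of x and some ancestor-or-self of y satisfy sibl."""
    return any(sibl(x[:n], y[:n]) for n in range(1, min(len(x), len(y)) + 1))


def psi_k(progress, x, ys):
    """Extension of a binary progress relation to k output positions."""
    return all(progress(x, y) for y in ys) and all(
        sibl_plus(a, b) for a, b in zip(ys, ys[1:])
    )


def within_two(x, y):
    return desc_bounded(x, y, 1, 2)


# Output positions 11, 12, 2 below the root are admissible for the bound 2.
assert psi_k(within_two, (), [(1, 1), (1, 2), (2,)])
# A position at distance 3 violates the bound.
assert not psi_k(within_two, (), [(1, 1, 1)])
```

The second conjunct of `psi_k` forces the output positions to lie in pairwise disjoint subtrees, ordered from left to right.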

The branching transitive closure operator is applied to a family of formulas. For an arbitrary $\mathcal{L} \subseteq$ MSO, by a family of formulas in $\mathcal{L}$ we mean a family

$\Phi = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le m)$,

where $m \in \mathbb{N}$, $\varphi_k(x, y_{1,k}) \in \mathcal{L}$, and $\varphi_k(x, y_{1,k})$ has the free variables $x, y_1, \ldots, y_k$.

Now let $\psi(x, y)$ be a progress formula and $\Phi$ be a family of formulas in $\mathcal{L}$. We define the family $(\psi\text{-TC}^l_x(\Phi) \mid l \ge 1)$ of MSO-formulas by induction as follows:

(i) $\psi\text{-TC}^1_x(\Phi) = \varphi_0(x)$

(ii) $\psi\text{-TC}^{l+1}_x(\Phi) = \bigvee_{1 \le k \le m} \exists y_{1,k}.\ \psi_k(x, y_{1,k}) \wedge \varphi_k(x, y_{1,k}) \wedge \bigvee_{\substack{l_1, \ldots, l_k \in \mathbb{N}_+ \\ l_1 + \ldots + l_k = l}} \ \bigwedge_{1 \le i \le k} \psi\text{-TC}^{l_i}_{y_i}(\Phi)$

Notice that the upper index $l$ in TC$^l$ denotes the level of the iteration, while in TC$^{[l]}$ it denotes the dimension of the relation of which the transitive closure is considered (cf. Section 1).

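The $\psi$-branching transitive closure of $\Phi$ (on $x$) is the formula $\psi\text{-TC}_x(\Phi)$ which has the following semantics. For every $\xi \in T_\Sigma$ and $v \in \mathrm{pos}(\xi)$, we have

$[\![\psi\text{-TC}_x(\Phi)]\!](\xi, v) = [\![\bigvee_{1 \le l \le \mathrm{size}(\xi)} \psi\text{-TC}^l_x(\Phi)]\!](\xi, v)$.

We note that it suffices to let $l$ range over the finite set $\{1, \ldots, \mathrm{size}(\xi)\}$, because the progress formula $\psi(x, y)$ is irreflexive, and hence we have $[\![\psi\text{-TC}^l_x(\Phi)]\!](\xi, v) = 0$ for every $\xi \in T_\Sigma$ and $v \in \mathrm{pos}(\xi)$, provided that $l > \mathrm{size}(\xi)$.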
Moreover, we define $\psi$-TC($\mathcal{L}$) to be the class of formulas of the form $\psi\text{-TC}_x(\Phi)$, where $\Phi$ is a family of formulas in $\mathcal{L}$. Finally, a weighted tree language $r : T_\Sigma \to S$ is $\psi$-TC($\mathcal{L}$)-definable if there is a $\psi$-TC($\mathcal{L}$)-formula $\psi\text{-TC}_x(\Phi)$ such that for every $\xi \in T_\Sigma$:

$r(\xi) = [\![\psi\text{-TC}_x(\Phi)]\!](\xi, \varepsilon)$.

In this paper we will deal with two particular progress formulas: $\mathrm{desc}^+(x, y)$ and $\mathrm{desc}_{[1,n]}(x, y)$. We call the $\mathrm{desc}_{[1,n]}$-branching transitive closure also $n$-bounded branching transitive closure. We abbreviate $\bigcup_{n \in \mathbb{N}_+} \mathrm{desc}_{[1,n]}$-TC($\mathcal{L}$) by B-TC($\mathcal{L}$) and we define B-TC($\mathcal{L}$)-definable weighted tree languages in the obvious way.

Since $\mathrm{desc}_{[1,n]}$ implies $\mathrm{desc}^+$, the $n$-bounded branching transitive closure can be expressed by the $\mathrm{desc}^+$-branching transitive closure. More precisely, for every family $\Phi = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le m)$ of formulas in $\mathcal{L}$ and $n \in \mathbb{N}_+$, we have that

$[\![\mathrm{desc}_{[1,n]}\text{-TC}_x(\Phi)]\!] = [\![\mathrm{desc}^+\text{-TC}_x(\Phi')]\!]$

where $\Phi' = (\mathrm{desc}_{[1,n],k}(x, y_{1,k}) \wedge \varphi_k(x, y_{1,k}) \mid 0 \le k \le m)$. Thus the following statement holds.

Observation 3.1 Let $\mathcal{L}$ be any subclass of MSO which is closed under conjunction. If $\mathrm{desc}_{[1,n]} \in \mathcal{L}$, then every B-TC($\mathcal{L}$)-definable weighted tree language is also $\mathrm{desc}^+$-TC($\mathcal{L}$)-definable. In particular, the statement holds for $\mathcal{L} =$ BMSO and $\mathcal{L} =$ (BFO+mod)$_{\mathrm{step}}$.

Example 3.2 Let $\Sigma = \{\sigma^{(2)}, \alpha^{(0)}\}$ and $\Delta = \{a^{(3)}, b^{(2)}, \alpha^{(0)}\}$ be two ranked alphabets. We define the tree homomorphism $h : T_\Delta \to T_\Sigma$ by

• $h(a) = \sigma(\sigma(z_1, z_2), z_3)$,

• $h(b) = \sigma(z_1, z_2)$, and

• $h(\alpha) = \alpha$.

(For the definition of tree homomorphism cf. [GS97, p. 18].) For the commutative semiring of natural numbers $(\mathbb{N}, +, \cdot, 0, 1)$, we define the weighted tree language $r : T_\Sigma \to \mathbb{N}$ for every $\xi \in T_\Sigma$ by

$r(\xi) = |h^{-1}(\xi)|$,

that is, $r(\xi)$ is the number of decompositions according to $h$. For instance, for $\xi_1 = \sigma(\sigma(\sigma(\alpha, \alpha), \alpha), \alpha)$ we have

$r(\xi_1) = |\{b(b(b(\alpha, \alpha), \alpha), \alpha),\ b(a(\alpha, \alpha, \alpha), \alpha),\ a(b(\alpha, \alpha), \alpha, \alpha)\}| = 3$,

and for $\xi_2 = \sigma(\alpha, \sigma(\alpha, \alpha))$ we have $r(\xi_2) = 1$.

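Our goal is to define a family $\Phi = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le 3)$ such that $r(\xi) = [\![\mathrm{desc}_{[1,2]}\text{-TC}_x(\Phi)]\!](\xi, \varepsilon)$. For this we let

• $\varphi_0(x) := \mathrm{label}_\alpha(x)$,

• $\varphi_1(x, y_1) := \mathrm{false}$,

• $\varphi_2(x, y_{1,2}) := \mathrm{label}_\sigma(x) \wedge \mathrm{edge}_1(x, y_1) \wedge \mathrm{edge}_2(x, y_2)$, and

• $\varphi_3(x, y_{1,3}) := \mathrm{label}_\sigma(x) \wedge \exists x_1.\, \mathrm{edge}_1(x, x_1) \wedge \mathrm{label}_\sigma(x_1) \wedge \mathrm{edge}_1(x_1, y_1) \wedge \mathrm{edge}_2(x_1, y_2) \wedge \mathrm{edge}_2(x, y_3)$.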

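Then, for every $\xi \in T_\Sigma$ we have that

if $l \notin \mathrm{size}(h^{-1}(\xi))$, then $[\![\mathrm{desc}_{[1,2]}\text{-TC}^l_x(\Phi)]\!](\xi, \varepsilon) = 0$. (3)

In particular, for our given tree $\xi_1$, we have that $\mathrm{size}(h^{-1}(\xi_1)) = \{6, 7\}$, and we obtain:

$[\![\mathrm{desc}_{[1,2]}\text{-TC}_x(\Phi)]\!](\xi_1, \varepsilon) = \sum_{l \in \mathrm{size}(h^{-1}(\xi_1))} [\![\mathrm{desc}_{[1,2]}\text{-TC}^l_x(\Phi)]\!](\xi_1, \varepsilon)$

$\qquad = [\![\mathrm{desc}_{[1,2]}\text{-TC}^6_x(\Phi)]\!](\xi_1, \varepsilon) + [\![\mathrm{desc}_{[1,2]}\text{-TC}^7_x(\Phi)]\!](\xi_1, \varepsilon)$.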

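We can further calculate:

$[\![\mathrm{desc}_{[1,2]}\text{-TC}^6_x(\Phi)]\!](\xi_1, \varepsilon) = a_1 + a_2$

where

$a_1 = [\![\exists y_{1,3}.\ \mathrm{desc}_{[1,2],3}(x, y_{1,3}) \wedge \varphi_3(x, y_{1,3}) \wedge \bigvee_{l_1 + l_2 + l_3 = 5} \big(\mathrm{desc}_{[1,2]}\text{-TC}^{l_1}_{y_1}(\Phi) \wedge \mathrm{desc}_{[1,2]}\text{-TC}^{l_2}_{y_2}(\Phi) \wedge \mathrm{desc}_{[1,2]}\text{-TC}^{l_3}_{y_3}(\Phi)\big)]\!](\xi_1, \varepsilon)$

and

$a_2 = [\![\exists y_{1,2}.\ \mathrm{desc}_{[1,2],2}(x, y_{1,2}) \wedge \varphi_2(x, y_{1,2}) \wedge \bigvee_{l_1 + l_2 = 5} \big(\mathrm{desc}_{[1,2]}\text{-TC}^{l_1}_{y_1}(\Phi) \wedge \mathrm{desc}_{[1,2]}\text{-TC}^{l_2}_{y_2}(\Phi)\big)]\!](\xi_1, \varepsilon)$.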

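Let us consider $a_1$ in more detail. Obviously, since $x$ is assigned to $\varepsilon$, the formula $\varphi_3(x, y_{1,3})$ uniquely determines the positions 11, 12, and 2 which have to be assigned to $y_1$, $y_2$, and $y_3$, resp., in order to obtain values different from 0. Thus:

$a_1 = [\![\bigvee_{l_1 + l_2 + l_3 = 5} \mathrm{desc}_{[1,2]}\text{-TC}^{l_1}_{y_1}(\Phi) \wedge \mathrm{desc}_{[1,2]}\text{-TC}^{l_2}_{y_2}(\Phi) \wedge \mathrm{desc}_{[1,2]}\text{-TC}^{l_3}_{y_3}(\Phi)]\!](\xi_1, 11, 12, 2)$

$\quad = [\![\mathrm{desc}_{[1,2]}\text{-TC}^{3}_{y_1}(\Phi)]\!](\xi_1, 11) \cdot [\![\mathrm{desc}_{[1,2]}\text{-TC}^{1}_{y_2}(\Phi)]\!](\xi_1, 12) \cdot [\![\mathrm{desc}_{[1,2]}\text{-TC}^{1}_{y_3}(\Phi)]\!](\xi_1, 2)$

due to (3). Then, for instance,

$[\![\mathrm{desc}_{[1,2]}\text{-TC}^{1}_{y_2}(\Phi)]\!](\xi_1, 12) = [\![\varphi_0(y_2)]\!](\xi_1, 12) = [\![\mathrm{label}_\alpha(y_2)]\!](\xi_1, 12) = 1$.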
In a similar way we can calculate the other occurrences of $[\![\mathrm{desc}_{[1,2]}\text{-TC}^{l}_{y}(\Phi)]\!](\xi_1, w)$ and obtain eventually that $a_1 = a_2 = 1$. Moreover,

$[\![\mathrm{desc}_{[1,2]}\text{-TC}^7_x(\Phi)]\!](\xi_1, \varepsilon) = [\![\exists y_{1,2}.\ \mathrm{desc}_{[1,2],2}(x, y_{1,2}) \wedge \varphi_2(x, y_{1,2}) \wedge \bigvee_{l_1 + l_2 = 6} \big(\mathrm{desc}_{[1,2]}\text{-TC}^{l_1}_{y_1}(\Phi) \wedge \mathrm{desc}_{[1,2]}\text{-TC}^{l_2}_{y_2}(\Phi)\big)]\!](\xi_1, \varepsilon) = 1$

and in total $[\![\mathrm{desc}_{[1,2]}\text{-TC}_x(\Phi)]\!](\xi_1, \varepsilon) = 3 = r(\xi_1)$.

Since $\Phi$ is a family of FO-formulas, the weighted tree language $r$ is $\mathrm{desc}_{[1,2]}$-TC(FO)-definable.
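
The calculation of this example can be replayed by a small program. The following Python sketch evaluates the level-wise recursion of the branching transitive closure over the natural numbers for the family used in the example. It is an illustration under stated assumptions rather than a general implementation: trees are nested tuples, the function `steps` hard-codes the output positions admitted by the binary and ternary formulas of the example, and the progress constraints are not checked separately because these formulas already force the outputs to lie at distance 1 or 2 below the input and in left-to-right order. All function names are hypothetical.

```python
from functools import lru_cache


def labelled_positions(tree, prefix=()):
    yield prefix, tree[0]
    for i, child in enumerate(tree[1:], start=1):
        yield from labelled_positions(child, prefix + (i,))


def compositions(total, parts):
    """All tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def branching_tc(tree):
    """Sum over all levels l = 1..size(tree) of the level-l value at the root."""
    label = dict(labelled_positions(tree))

    def steps(v):
        """Tuples of output positions admitted at input position v."""
        outputs = []
        if label[v] == "sigma":
            outputs.append((v + (1,), v + (2,)))                 # binary step
            if label[v + (1,)] == "sigma":
                outputs.append((v + (1, 1), v + (1, 2), v + (2,)))  # ternary step
        return outputs

    @lru_cache(maxsize=None)
    def tc(level, v):
        if level == 1:
            return int(label[v] == "alpha")   # the formula phi_0 ends the iteration
        total = 0
        for outs in steps(v):
            for split in compositions(level - 1, len(outs)):
                term = 1
                for l_i, v_i in zip(split, outs):
                    term *= tc(l_i, v_i)
                total += term
        return total

    return sum(tc(level, ()) for level in range(1, len(label) + 1))


alpha = ("alpha",)
xi1 = ("sigma", ("sigma", ("sigma", alpha, alpha), alpha), alpha)
xi2 = ("sigma", alpha, ("sigma", alpha, alpha))
assert branching_tc(xi1) == 3
assert branching_tc(xi2) == 1
```

For the first tree the only non-zero levels are 6 and 7, contributing 2 and 1 respectively, which matches the sizes of the three preimages under $h$.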



4. The Main Result 

Theorem 4.1 Let $S$ be an arbitrary commutative semiring and $r : T_\Sigma \to S$ a weighted tree language. Then the following are equivalent:

(a) $r$ is RMSO-definable,

(b) $r$ is recognizable,

(c) $r$ is B-TC((BFO+mod)$_{\mathrm{step}}$)-definable,

(d) $r$ is B-TC(BMSO$_{\mathrm{step}}$)-definable,

(e) $r$ is desc$^+$-TC((BFO+mod)$_{\mathrm{step}}$)-definable,

(f) $r$ is desc$^+$-TC(BMSO$_{\mathrm{step}}$)-definable,

(g) $r$ is $\exists\forall$((BFO+mod)$_{\mathrm{step}}$)-definable,

(h) $r$ is $\exists\forall$(BMSO$_{\mathrm{step}}$)-definable.

Proof By Theorem 5.1, (a) and (b) are equivalent. Since BFO+mod $\subseteq$ BMSO, we have that (c) implies (d), (e) implies (f), and (g) implies (h). Theorem 6.1 and Observation 6.5 prove that (b) implies (c). By Observation 3.1 we obtain that (c) implies (e) and (d) implies (f). By Corollary 7.2, (e) implies (g) and (f) implies (h). Since $\exists\forall$(BMSO$_{\mathrm{step}}$) $\subseteq$ RMSO, also (h) implies (a). □

As a corollary of our main result, we obtain a characterization of recognizable tree languages in terms of our branching transitive closure operator. Let us denote by MSO$_t$($\Sigma$) (or shortly by MSO$_t$) the set of (unweighted) monadic second-order formulas for trees over $\Sigma$ (cf. [Don70, TW68]) and by FO$_t$ its first-order fragment.

Corollary 4.2 Let $L \subseteq T_\Sigma$ be an arbitrary tree language. Then the following are equivalent:

(a) $L$ is MSO$_t$-definable,

(b) $L$ is recognizable,

(c) $L$ is B-TC(FO$_t$+mod)-definable,

(d) $L$ is B-TC(MSO$_t$)-definable,

(e) $L$ is desc$^+$-TC(FO$_t$+mod)-definable,

(f) $L$ is desc$^+$-TC(MSO$_t$)-definable,

(g) $L$ is $\exists\forall$(FO$_t$+mod)-definable,

(h) $L$ is $\exists\forall$(MSO$_t$)-definable.



Proof (Sketch.) Since Theorem 4.1 holds for the Boolean semiring $\mathbb{B}$ (with operations disjunction and conjunction), it suffices to prove the following statement (†): for every $L \subseteq T_\Sigma$ and $x \in \{a, \ldots, h\}$, statement (x) holds for $L$ if and only if statement (x) of Theorem 4.1 holds for $S = \mathbb{B}$ and $r = 1_L$.

To prove case x = a, first we observe that the logics RMSO($\Sigma$, $\mathbb{B}$) and MSO($\Sigma$, $\mathbb{B}$) are equivalent. Moreover, each MSO$_t$($\Sigma$)-formula can be considered as an MSO($\Sigma$, $\mathbb{B}$)-formula with the same semantics. Vice versa, every MSO($\Sigma$, $\mathbb{B}$)-formula can be transformed into an equivalent MSO$_t$($\Sigma$)-formula by writing, e.g., $\exists x.(\mathrm{label}_\sigma(x) \wedge \neg\mathrm{label}_\sigma(x))$ for $0$ and $\forall x.(\mathrm{label}_\sigma(x) \vee \neg\mathrm{label}_\sigma(x))$ for $1$ for some $\sigma \in \Sigma$. Hence (†) holds in this case.

For the proof of x = b, see [FV09, Subsect. 3.2].

To prove case x = c, we observe that (BFO+mod)$_{\mathrm{step}}$($\Sigma$, $\mathbb{B}$) and FO+mod($\Sigma$, $\mathbb{B}$) are equivalent. Moreover, FO$_t$+mod($\Sigma$)-formulas and FO+mod($\Sigma$, $\mathbb{B}$)-formulas correspond to each other in the natural way described above for MSO$_t$($\Sigma$) and MSO($\Sigma$, $\mathbb{B}$). The proof of the other cases is similar.



5. RMSO-definability is Equivalent to Recognizability 

We have defined the fragment RMSO of restricted MSO in the spirit of [Gas10] (and of
[BGMZ10]). It is, however, syntactically slightly different from the fragment with the
same name introduced in [DG05, DG07] and used in [DV06, DV11]. In the restricted
MSO-fragment of [DG05, DG07] (cf. e.g. [DV06, Def. 4.1 and 4.8]), $x \preceq y$ is not
an atomic formula, negation is only applicable to atomic formulas except coefficients
from $S$, and second-order universal quantification is not allowed. Henceforth we will
call this fragment RMSO' (cf. [DV06, Def. 4.8]). In the RMSO of the present paper,
$x \preceq y$ is an atomic formula, negation can be applied to BMSO-step formulas, and
second-order universal quantification can be applied to BMSO formulas. However,
semantically both fragments are equivalent. We prove this by showing that a weighted
tree language $r$ is recognizable if, and only if, it is RMSO-definable, and by using the
fact that this equivalence was proved for RMSO'-definability in [DV06, Thm. 5.1].

Theorem 5.1 Let $r : T_\Sigma \to S$ be a weighted tree language. Then $r$ is recognizable if,
and only if, it is RMSO-definable.

Proof By [DV06, Thm. 5.1], a weighted tree language is recognizable if, and only if,
it is definable in RMSO'. In the proof of the present theorem we indicate that we can
replace RMSO' by our RMSO.

First we show that every RMSO-definable weighted tree language is recognizable. If
the formulas have the form $a$, $\mathrm{label}_\sigma(x)$, $\mathrm{edge}_i(x,y)$, $x \in X$, $\varphi \wedge \psi$, $\varphi \vee \psi$, $\exists x.\varphi$, or $\exists X.\varphi$,
then they are also in RMSO', hence by [DV06, Lm. 5.2-5.4] they are recognizable. For
the formula $x \preceq y$ we apply Observation 2.2. Next we consider a formula $\varphi$ of the
form $\neg\psi$ where $\psi$ is a BMSO-step formula. Then $\varphi$ is also a BMSO-step formula,
and by Lemma 2.3 its semantics is a recognizable step function. Using the fact that
recognizable weighted tree languages are closed under scalar product and summation
(cf. [DV06, Lm. 3.3]), we obtain that the semantics of $\varphi$ is recognizable. Next let $\varphi$
be of the form $\forall x.\psi$ where $\psi$ is a BMSO-step formula. By Lemma 2.3 the semantics
of $\psi$ is a recognizable step function and thus by [DV06, Lm. 5.5] the semantics of
$\varphi$ is recognizable. Finally let $\varphi$ be of the form $\forall X.\chi$ where $\chi$ is a BMSO-formula.
Then also $\varphi$ is a BMSO-formula of which the semantics is a recognizable weighted tree
language, again due to Observation 2.2. Thus, summing up, every RMSO-definable
weighted tree language is recognizable.

Second we show that every recognizable weighted tree language is RMSO-definable.
In [DV06, Thm. 5.1] it was proved that every recognizable weighted tree language
is definable by some RMSO'-formula. Thus it suffices to show that every weighted
tree language which is definable by some RMSO'-formula is also definable by some
RMSO-formula. We observe that all RMSO'-formulas are also RMSO-formulas except
the RMSO'-formula $\varphi = \forall x.\psi$, where $[\![\psi]\!]$ is a recognizable step function. However, by
the definition of the recognizable step function, it should be clear that in this case $\psi$ is
equivalent to a BMSO-step formula and hence $\varphi$ is equivalent to an RMSO-formula. ■



6. From wta To Bounded Branching Transitive Closure 

In this section we will simulate the behaviour of a wta by the bounded branching
transitive closure of a particular family of formulas. For this let $\mathcal{A} = (Q, \Sigma, \delta, F)$ be a
wta with $Q = \{0, \dots, n-1\}$ and $n \in \mathbb{N}_+$. By Lemma 2.1 we can assume that $F = \{0\}$.
Our goal is to construct a family $\Phi_\mathcal{A} = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le \max(\Sigma,n))$ of formulas
in $(\mathrm{BFO{+}mod})_{\mathrm{step}}$ such that the following theorem holds.

Theorem 6.1 For every $\xi \in T_\Sigma$, we have that $r_\mathcal{A}(\xi) = [\![\mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}_x(\Phi_\mathcal{A})]\!](\xi, \varepsilon)$.



The main idea for the construction of $\Phi_\mathcal{A}$ and the inductive proof of Theorem 6.1
(cf. Statement 1 in Section 6.3) is due to [Tho82]. For this we decompose an input tree
$\xi$ into slices (cf. Section 6.1). Then the behaviour of $\mathcal{A}$ on $\xi$ induces a behaviour
on the slices of $\xi$ (cf. Lemma 6.4) which is simulated by $\mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}_x(\Phi_\mathcal{A})$. More
precisely, let us denote the topmost slice of the decomposition of $\xi$ at some position $u$
by $\mathrm{head}_n(\xi,u)$ and the positions of $\xi$ at which the slices below $\mathrm{head}_n(\xi,u)$ start, by
$u_1, \dots, u_k$. Then we construct $\Phi_\mathcal{A}$ such that the decomposition

$$h(\xi|_u)_q = \sum_{q_1,\dots,q_k \in Q} h_{q_1 \cdots q_k}(\mathrm{head}_n(\xi,u))_q \cdot \prod_{1 \le i \le k} h(\xi|_{u_i})_{q_i} \tag{4}$$

of the behaviour of $\mathcal{A}$ is synchronized with one level of the iteration

$$\mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}^{l+1}_x(\Phi_\mathcal{A}) = \bigvee_{1 \le k \le \max(\Sigma,n)} \exists y_{1,k}.\ \mathrm{desc}_{[1,2n],k}(x,y_{1,k}) \wedge \varphi_k(x,y_{1,k}) \wedge \bigvee_{\substack{l_1,\dots,l_k \in \mathbb{N}_+ \\ l_1+\dots+l_k = l}} \ \bigwedge_{1 \le i \le k} \mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi_\mathcal{A}). \tag{5}$$

In Fig. 3 we visualize this synchronization for a wta $\mathcal{A}$ with state set $\{0, 1, 2\}$. In
part (a) we show the subexpression $h_{20}(\mathrm{head}_3(\xi,u))_1 \cdot h(\xi|_{u_1})_2 \cdot h(\xi|_{u_2})_0$ of the right-hand
side of (4) with $n = 3$, $k = 2$, $q = 1$, $q_1 = 2$, and $q_2 = 0$. In part (b) we have
abbreviated $\mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}^{l}_{x}(\Phi_\mathcal{A})$ by $\mathrm{TC}^{l}_{x}(\Phi_\mathcal{A})$.

We will represent the states of $\mathcal{A}$ by positions of $\xi$. Roughly speaking, the synchronization
happens in the way that $[\![\mathrm{desc}_{[1,2n],k}(x,y_{1,k}) \wedge \varphi_k(x,y_{1,k})]\!](\xi,v,v_{1,k})$ provides
the value $h_{q_1 \cdots q_k}(\mathrm{head}_n(\xi,u))_q$, where $v$ and $v_1,\dots,v_k$ are the positions of $\mathrm{head}_n(\xi,u)$ and
$\mathrm{head}_n(\xi,u_1),\dots,\mathrm{head}_n(\xi,u_k)$ which encode $q$ and $q_1,\dots,q_k$, respectively. Moreover,
$[\![\mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi_\mathcal{A})]\!](\xi,v_i)$ provides $h(\xi|_{u_i})_{q_i}$.

6.1. Decomposition of a Tree into Slices 

We represent slices as particular trees with variables. For this, we introduce the sets
$Z = \{z_1, z_2, z_3, \dots\}$ and $Z_k = \{z_1, \dots, z_k\}$, $k \in \mathbb{N}$, of variables. Then we denote by
$C_{\Sigma,k}$ the set of all trees $\zeta \in T_\Sigma(Z_k)$ such that each $z_i \in Z_k$ occurs exactly once in $\zeta$
and the variables occur in the order $z_1, \dots, z_k$ from left to right. Note that $C_{\Sigma,0} = T_\Sigma$.
For every $k \in \mathbb{N}$, let

$$C^n_{\Sigma,k} = \{\zeta \in C_{\Sigma,k} \mid \forall w \in \mathrm{pos}(\zeta) : (|w| < 2n) \wedge (\zeta(w) \in Z_k \rightarrow |w| = n)\}.$$

We note that $C^n_{\Sigma,0} = \{\xi \in T_\Sigma \mid \mathrm{height}(\xi) < 2n\}$. Moreover, it should be clear that
there is a $k_0$ (depending also on $\Sigma$) such that $C^n_{\Sigma,k} = \emptyset$ for every $k > k_0$. We denote
the smallest such $k_0$ by $\max(\Sigma,n)$. It is also clear that $C^n_{\Sigma,i} \cap C^n_{\Sigma,j} = \emptyset$ for every $i \ne j$.

The next observation is crucial when decomposing a tree into slices. 

Observation 6.2 For every $\xi \in T_\Sigma$ and $u \in \mathrm{pos}(\xi)$, there is a unique $k \in \mathbb{N}$ and a
unique sequence $u_1, \dots, u_k \in \mathrm{pos}(\xi)$ such that

• $(\xi[z_1]_{u_1} \dots [z_k]_{u_k})|_u \in C^n_{\Sigma,k}$ and

• $\mathrm{height}(\xi|_{u_i}) \ge n$ for every $1 \le i \le k$.

We will denote the tree $(\xi[z_1]_{u_1} \dots [z_k]_{u_k})|_u$ by $\mathrm{head}_n(\xi,u)$ and the sequence
$(u_1, \dots, u_k)$ by $\mathrm{cut}_n(\xi,u)$. In particular, $\mathrm{head}_n(\xi,u) = \xi|_u$ and $\mathrm{cut}_n(\xi,u) = ()$, i.e.,
$k = 0$, if and only if $\xi|_u \in C^n_{\Sigma,0}$. We abbreviate $\mathrm{head}_n(\xi,\varepsilon)$ and $\mathrm{cut}_n(\xi,\varepsilon)$ by $\mathrm{head}_n(\xi)$
and $\mathrm{cut}_n(\xi)$, respectively.

The tree $\mathrm{head}_n(\xi,u)$ is the slice of $\xi$ at $u$ and the positions $u_1, \dots, u_k$ are cut-positions
for $\xi$ and $u$. By applying Observation 6.2 repeatedly, we obtain a unique
decomposition of $\xi$ into slices (cf. Fig. 2). Formally, we define the ranked alphabet
$C^n_\Sigma$ such that $(C^n_\Sigma)^{(k)} = C^n_{\Sigma,k}$ for every $k \ge 0$ (recall that $C^n_\Sigma$ is finite). Moreover, we
define the mapping $\mathrm{dec}_n : T_\Sigma \to T_{C^n_\Sigma}$ inductively as follows. For every $\xi \in T_\Sigma$, let

$$\mathrm{dec}_n(\xi) = \mathrm{head}_n(\xi)\big(\mathrm{dec}_n(\xi|_{u_1}), \dots, \mathrm{dec}_n(\xi|_{u_k})\big),$$

where $\mathrm{cut}_n(\xi) = (u_1, \dots, u_k)$.

Observation 6.3 For every $\xi \in T_\Sigma$, $\mathrm{size}(\mathrm{dec}_n(\xi)) = 1$ if and only if $\mathrm{height}(\xi) < 2n$.
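
The decomposition into slices can be illustrated by a small program. The following is only a sketch under assumptions that are not part of the paper: trees are encoded as nested pairs `(label, children)`, variables are encoded as leaves labelled `"z1"`, `"z2"`, ..., and the names `height`, `head_and_cut`, `dec`, `size`, and `chain` are hypothetical helpers. It assumes $n \ge 1$ and treats a position as a cut-position exactly when it lies at depth $n$ and carries a subtree of height at least $n$.

```python
def height(t):
    """Height of a tree encoded as (label, children)."""
    _, kids = t
    return 0 if not kids else 1 + max(height(c) for c in kids)


def head_and_cut(t, n):
    """Return (head_n(t), subtrees at the cut-positions, left to right)."""
    cuts = []

    def go(s, depth):
        # Cut-position: depth n and a subtree of height at least n.
        if depth == n and height(s) >= n:
            cuts.append(s)
            return ("z%d" % len(cuts), ())  # variable leaf replacing the subtree
        label, kids = s
        return (label, tuple(go(c, depth + 1) for c in kids))

    return go(t, 0), cuts


def dec(t, n):
    """dec_n(t): a tree whose labels are slices (assumes n >= 1)."""
    head, cuts = head_and_cut(t, n)
    return (head, tuple(dec(c, n) for c in cuts))


def size(t):
    """Number of positions of a tree."""
    return 1 + sum(size(c) for c in t[1])


def chain(h):
    """Monadic tree of height h, used only as a test input."""
    t = ("a", ())
    for _ in range(h):
        t = ("g", (t,))
    return t
```

On monadic trees this sketch agrees with Observation 6.3: `dec` yields a single slice exactly when the height of the input is smaller than $2n$.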

The following decomposition lemma will be crucial in the simulation of a wta by
means of branching transitive closure (see Section 6). We note that the lemma can be
derived from [Mal06, Prop. 18], which is proved for bottom-up tree series transducers,
i.e. for a generalization of weighted tree automata. Recall that $S$ is commutative.

Lemma 6.4 Let $\xi \in T_\Sigma$, $q \in Q$, and $\mathrm{cut}_n(\xi) = (u_1, \dots, u_k)$. Then

$$h(\xi)_q = \sum_{q_1,\dots,q_k \in Q} h_{q_1 \cdots q_k}(\mathrm{head}_n(\xi))_q \cdot \prod_{1 \le i \le k} h(\xi|_{u_i})_{q_i}.$$

Proof (Sketch.) We can prove the following, more general statement: for every
$k \in \mathbb{N}$, $\zeta \in C_{\Sigma,k}$, $\xi_1, \dots, \xi_k \in T_\Sigma$, and $q \in Q$, we have

$$h(\zeta[\xi_1, \dots, \xi_k])_q = \sum_{q_1,\dots,q_k \in Q} h_{q_1 \cdots q_k}(\zeta)_q \cdot \prod_{1 \le i \le k} h(\xi_i)_{q_i},$$

where $\zeta[\xi_1, \dots, \xi_k]$ denotes the tree obtained by replacing every occurrence of $z_i$ in $\zeta$
by $\xi_i$ for $1 \le i \le k$. Since the case $k = 0$ is trivial, we may assume that $k \in \mathbb{N}_+$ and
proceed by induction on the height of $\zeta$. If $\mathrm{height}(\zeta) = 0$, then $k = 1$ and $\zeta = z_1$,
hence the statement holds again trivially. Now let $\mathrm{height}(\zeta) > 0$, i.e., $\zeta = \sigma(\zeta_1, \dots, \zeta_l)$
for some $l \in \mathbb{N}_+$, $\sigma \in \Sigma^{(l)}$, and $\zeta_1, \dots, \zeta_l \in T_\Sigma(Z_k)$. By standard arguments, there are
$k_1, \dots, k_l \in \mathbb{N}$ and there are $\eta_j \in C_{\Sigma,k_j}$ for $1 \le j \le l$, such that $k_1 + \dots + k_l = k$ and

$$\zeta[\xi_1, \dots, \xi_k] = \sigma\big(\eta_1[\xi_1, \dots, \xi_{k_1}], \dots, \eta_l[\xi_{k_1+\dots+k_{l-1}+1}, \dots, \xi_k]\big).$$

Now we can prove the statement by unfolding $h(\zeta[\xi_1, \dots, \xi_k])_q$ and organizing the
computation appropriately. In the first step we apply the weighted transition for $\sigma$.
Then the statement is proved for indexes $j$ with $k_j = 0$, while we apply the induction
hypothesis on $\mathrm{height}(\zeta)$ for indexes $j$ with $k_j \in \mathbb{N}_+$. ■
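
Lemma 6.4 can be checked numerically on a small instance. The sketch below rests on assumptions that are not taken from the paper: the semiring is that of the natural numbers, trees are nested pairs `(label, children)`, a variable $z_i$ is encoded by the label `("z", i)` with a 0-based index, a variable contributes weight 1 in its assigned state and 0 otherwise, and `toy_delta` is a purely hypothetical transition weight function. The helper names are likewise illustrative.

```python
from itertools import product


def height(t):
    return 0 if not t[1] else 1 + max(height(c) for c in t[1])


def head_and_cut(t, n):
    """Slice at the root and the subtrees at its cut-positions (n >= 1)."""
    cuts = []

    def go(s, depth):
        if depth == n and height(s) >= n:
            cuts.append(s)
            return (("z", len(cuts) - 1), ())  # variable with 0-based index
        return (s[0], tuple(go(c, depth + 1) for c in s[1]))

    return go(t, 0), cuts


def h(t, q, delta, states, assign=()):
    """Weight of t in state q; variable i has weight 1 iff q == assign[i]."""
    label, kids = t
    if isinstance(label, tuple):
        return 1 if q == assign[label[1]] else 0
    total = 0
    for qs in product(states, repeat=len(kids)):
        w = delta(qs, label, q)
        for c, qc in zip(kids, qs):
            w *= h(c, qc, delta, states, assign)
        total += w
    return total


def lemma_rhs(t, q, n, delta, states):
    """Right-hand side of the decomposition: sum over states at the cuts."""
    head, cuts = head_and_cut(t, n)
    total = 0
    for qs in product(states, repeat=len(cuts)):
        w = h(head, q, delta, states, qs)
        for c, qc in zip(cuts, qs):
            w *= h(c, qc, delta, states)
        total += w
    return total


def toy_delta(qs, label, q):
    # Hypothetical weights over the natural numbers, for illustration only.
    return (sum(qs) + q + len(label)) % 3
```

Comparing `lemma_rhs(t, q, n, toy_delta, states)` with `h(t, q, toy_delta, states)` on a few trees exercises the equality for this one toy automaton; it is an illustration, not a proof.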






6.2. The Construction of $\Phi_\mathcal{A}$

The formulas $\varphi_k(x, y_{1,k})$ are composed of subformulas that simulate certain properties
of $\mathcal{A}$ (cf. Lemma 6.7). Let us first establish these subformulas and then assemble
$\varphi_k(x, y_{1,k})$. Conceptually, we follow the construction of the corresponding formulas in
[BGMZ10] and we borrow several notions from there. However, due to the branching
inherent in trees, we sometimes have to employ more sophisticated formulas.

As mentioned, we will represent (encode) states of $\mathcal{A}$ by positions of the input tree.
A subtask of $\varphi_k(x, y_{1,k})$ is to find out, for a position $v$, the base position of $v$ and the
state encoded by $v$. Next we define these concepts, and we start with a list of useful
macros.



Identifying Base Nodes and Coded States. Let $\xi \in T_\Sigma$ and $v \in \mathrm{pos}(\xi)$. Then $v$
determines (encodes) uniquely a state $q \in Q$ as follows. In fact, there are uniquely
determined $i \in \mathbb{N}$ and $q \in Q$ such that $|v| = i \cdot n + q$. We denote by

• $\langle v \rangle$ the prefix of $v$ of length $i \cdot n$ and call it the base position of $v$, and by

• $[v]$ the state $q$ and call it the state encoded by $v$.
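
The two notions amount to a division with remainder on the length of a position. As a minimal sketch, assuming that positions are encoded as tuples of child indices (an encoding chosen here for illustration) and using the hypothetical helper name `base_and_state`:

```python
def base_and_state(v, n):
    """Split |v| = i*n + q and return (base position <v>, encoded state [v])."""
    q = len(v) % n              # the state encoded by v
    return v[:len(v) - q], q    # prefix of length i*n, and q
```

For example, with $n = 3$ the position `(1, 2, 1, 1, 2)` of length $5 = 1 \cdot 3 + 2$ has base position `(1, 2, 1)` and encodes state 2.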

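Note that $\langle v \rangle \in \mathrm{pos}(\xi)$ and $[v] \in Q$. Now we define the macro $y = \langle x \rangle_n$ in BFO+mod
which allows us to identify the base position in the sense that:

$$[\![y = \langle x \rangle_n]\!](\xi, v, u) = \begin{cases} 1 & \text{if } u = \langle v \rangle \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

(where $x$ and $y$ are assigned to $v$ and $u$, resp.). The definition of the macro is

$$(y = \langle x \rangle_n) := \bigwedge_{0 \le q < n} \big((|x| \equiv_n q) \rightarrow \mathrm{desc}_q(y, x)\big);$$

certainly this satisfies (6).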

It is clear from the definition that $[v] = |v| - |\langle v \rangle| \in Q$, i.e., the state represented
by a position $v$ is the number given by the distance between $\langle v \rangle$ and $v$. However, for
reasons detailed later, we would like the base node $\langle v \rangle$ to coincide with a cut-position
of $\xi$. But then, due to the branching inherent in $\xi$, the state $[v]$ may be represented
by another node $v'$ satisfying that $\langle v' \rangle = \langle v \rangle$ and $[v'] = [v]$. We will avoid this by
forcing the assignment to choose a $v$ which is on the leftmost path from $\langle v \rangle$, and this
leftmost path must have at least length $n - 1$ (in order to be able to encode each of
the $n$ states). Thus we define the following macros to identify states:

• $\text{form-path}(y_{0,n}) := \bigvee_{w \in \mathbb{N}^n} \text{form-path}_w(y_{0,n})$
($y_0, \dots, y_n$ form a path),

• $\text{form-lmp}(y_{1,n}) := \text{form-path}(y_{1,n}) \wedge \forall z.(\mathrm{desc}_{n-1}(y_1, z) \rightarrow (y_n \preceq z))$
(the positions $y_1, \dots, y_n$ form the leftmost path of length $n - 1$),

• $\text{on-lmp}_{n-1}(x,y) := \exists y_{1,n}.(x = y_1) \wedge \text{form-lmp}(y_{1,n}) \wedge \bigvee_{1 \le i \le n}(y_i = y)$
(there is a path of length $n - 1$ starting from $x$, and $y$ is a position of the leftmost
such path).



Identifying the Cut-Positions. Due to Observation 6.2, any position $u$ uniquely determines
the sequence $\mathrm{cut}_n(\xi, u)$ of cut-positions. The next subtask of $\varphi_k(x, y_{1,k})$ is to
identify this sequence. For this we employ the macro $\text{form-cut}_{n,k}(x, y_{1,k})$ with $k \ge 0$
such that, for every $u, u_1, \dots, u_k \in \mathrm{pos}(\xi)$:

$$[\![\text{form-cut}_{n,k}(x, y_{1,k})]\!](\xi, u, u_{1,k}) = \begin{cases} 1 & \text{if } \mathrm{cut}_n(\xi,u) = (u_1, \dots, u_k) \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$
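We define

$$\text{form-cut}_{n,k}(x, y_{1,k}) := \Big(\bigwedge_{i=1}^{k} \mathrm{desc}_n(x, y_i) \wedge (\mathrm{height}(y_i) \ge n)\Big) \wedge \Big(\bigwedge_{i=1}^{k-1} \mathrm{sibl}_n(y_i, y_{i+1})\Big) \wedge \Big(\forall z.\big(\mathrm{desc}_n(x, z) \wedge (\mathrm{height}(z) \ge n)\big) \rightarrow \big(\bigvee_{i=1}^{k} z = y_i\big)\Big),$$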

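where we have used the following auxiliary macros:

• $(\mathrm{height}(x) \ge n) := \exists z.\mathrm{desc}_n(x, z)$

• $\mathrm{sibl}_n(x,y) := \exists x', y'.(\mathrm{sibl}(x', y') \wedge \mathrm{desc}_{n-1}(x', x) \wedge \mathrm{desc}_{n-1}(y', y))$.

Note that

$$[\![\mathrm{sibl}_n(x, y)]\!](u_1, u_2) = \begin{cases} 1 & \text{if } \exists u \in \mathrm{pos}(\xi)\ \exists i,j \in \mathbb{N}_+\ \exists u_1', u_2' \in \mathbb{N}^{n-1} : i < j \wedge u_1 = u i u_1' \wedge u_2 = u j u_2' \\ 0 & \text{otherwise.} \end{cases}$$

Taking the definition of $\mathrm{cut}_n(\xi, u)$ into account, it is not difficult to see that our macro
satisfies (7). In particular, $\text{form-cut}_{n,0}(u)$ holds if and only if $\xi|_u \in C^n_{\Sigma,0}$.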



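Identifying the Head. For every $u \in \mathrm{pos}(\xi)$ with $\mathrm{cut}_n(\xi,u) = (u_1, \dots, u_k)$, we can
identify the piece of $\xi$ which starts at $u$ and ends in $(u_1, \dots, u_k)$, which is $\mathrm{head}_n(\xi, u)$.
More precisely, for every $k \in \mathbb{N}$ and $\zeta \in C^n_{\Sigma,k}$ we define the macro $\mathrm{check}_\zeta(x, y_{1,k})$ such
that for every $\xi \in T_\Sigma$, $u, u_1, \dots, u_k \in \mathrm{pos}(\xi)$:

$$[\![\mathrm{check}_\zeta(x, y_{1,k})]\!](\xi, u, u_{1,k}) = \begin{cases} 1 & \text{if } \zeta = (\xi[z_1]_{u_1} \dots [z_k]_{u_k})|_u \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$

Hence in case $k = 0$ we have

$$[\![\mathrm{check}_\zeta(x)]\!](\xi, u) = \begin{cases} 1 & \text{if } \zeta = \xi|_u \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$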
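The definition of the macro is as follows:

$$\mathrm{check}_\zeta(x, y_{1,k}) := \bigwedge_{w \in \mathrm{pos}(\zeta) \setminus \mathrm{pos}_{Z_k}(\zeta)} \big(\exists y.(y = xw) \wedge \mathrm{label}_{\zeta(w)}(y)\big) \wedge \bigwedge_{1 \le i \le k} \big(y_i = x\,\mathrm{pos}_{z_i}(\zeta)\big).$$

In case $k = 0$ we have

$$\mathrm{check}_\zeta(x) = \bigwedge_{w \in \mathrm{pos}(\zeta)} \big(\exists y.(y = xw) \wedge \mathrm{label}_{\zeta(w)}(y)\big).$$

It is easy to observe that (8) is satisfied.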

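Construction of $\Phi_\mathcal{A}$. Now we define the family $\Phi_\mathcal{A} = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le \max(\Sigma,n))$ of MSO-formulas where

$$\varphi_0(x) := \bigvee_{0 \le q \le n-1}\ \bigvee_{\zeta \in C^n_{\Sigma,0}} \big(\exists z.\theta_{q,\zeta}(x, z)\big) \wedge h(\zeta)_q \tag{10}$$

with

$$\theta_{q,\zeta}(x, z) := (z = \langle x \rangle_n) \wedge \text{form-cut}_{n,0}(z) \wedge \mathrm{desc}_q(z, x) \wedge \mathrm{check}_\zeta(z),$$

and for every $1 \le k \le \max(\Sigma, n)$

$$\varphi_k(x, y_{1,k}) := \bigvee_{0 \le q_1,\dots,q_k,q \le n-1}\ \bigvee_{\zeta \in C^n_{\Sigma,k}} \big(\exists z, z_{1,k}.\theta_{q,q_{1,k},\zeta}(x, y_{1,k}, z, z_{1,k})\big) \wedge h_{q_1 \cdots q_k}(\zeta)_q \tag{11}$$

with

$$\theta_{q,q_{1,k},\zeta}(x, y_{1,k}, z, z_{1,k}) := (z = \langle x \rangle_n) \wedge \text{form-cut}_{n,k}(z, z_{1,k}) \wedge \bigwedge_{1 \le i \le k} \text{on-lmp}_{n-1}(z_i, y_i) \wedge \mathrm{desc}_q(z, x) \wedge \bigwedge_{1 \le i \le k} \mathrm{desc}_{q_i}(z_i, y_i) \wedge \mathrm{check}_\zeta(z, z_{1,k}).$$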

Observation 6.5 $\Phi_\mathcal{A}$ is a family of (BFO+mod)-step formulas.

6.3. Proof of Theorem 6.1

Now we will prove Theorem 6.1. We split the proof into three steps. In the first
step we determine the semantics of the formula $\varphi_k(x, y_{1,k})$. We prepare this by the
following technical lemma.

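Lemma 6.6 For every $\xi \in T_\Sigma$, $0 \le k \le \max(\Sigma,n)$, $v, v_1, \dots, v_k \in \mathrm{pos}(\xi)$, $0 \le q_1, \dots, q_k, q \le n - 1$, and $\zeta \in C^n_{\Sigma,k}$ we have

$$(\exists z, z_{1,k}.\theta_{q,q_{1,k},\zeta})^\xi(v, v_{1,k}) \text{ holds} \iff \theta^\xi_{q,q_{1,k},\zeta}(v, v_{1,k}, \langle v \rangle, \langle v \rangle_{1,k}) \text{ holds},$$

where $\langle v \rangle_{1,k}$ abbreviates the sequence $\langle v_1 \rangle, \dots, \langle v_k \rangle$.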

Proof The direction $\Leftarrow$ holds by definition. To show the direction $\Rightarrow$, assume that
there are $u, u_1, \dots, u_k \in \mathrm{pos}(\xi)$ such that $\theta^\xi_{q,q_{1,k},\zeta}(v, v_{1,k}, u, u_{1,k})$ holds. Then, in particular,
we have that

(a) $(u = \langle v \rangle_n)^\xi$,

(b) $\text{form-cut}^\xi_{n,k}(u, u_{1,k})$, and

(c) $\mathrm{desc}^\xi_{q_i}(u_i, v_i)$ holds for every $1 \le i \le k$.

Hence $u = \langle v \rangle$ by (a). By (b), we have $\text{form-cut}^\xi_{n,k}(\langle v \rangle, u_{1,k})$. This latter, Condition
(c), and the fact that $0 \le q_i \le n - 1$ for every $1 \le i \le k$ imply that $u_i = \langle v_i \rangle$ for every
$1 \le i \le k$. ■

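Now we are able to give the semantics of $\varphi_k(x, y_{1,k})$.

Lemma 6.7 For every $\xi \in T_\Sigma$, $0 \le k \le \max(\Sigma,n)$, and $v, v_1, \dots, v_k \in \mathrm{pos}(\xi)$, we
have

$$[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) = \begin{cases} h_{[v_1] \cdots [v_k]}(\mathrm{head}_n(\xi, \langle v \rangle))_{[v]} & \text{if } \text{form-cut}^\xi_{n,k}(\langle v \rangle, \langle v \rangle_{1,k}) \text{ and } \text{on-lmp}^\xi_{n-1}(\langle v_i \rangle, v_i) \text{ hold for every } 1 \le i \le k \\ 0 & \text{otherwise.} \end{cases}$$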



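Proof Case 1: $\text{form-cut}^\xi_{n,k}(\langle v \rangle, \langle v \rangle_{1,k})$ and $\text{on-lmp}^\xi_{n-1}(\langle v_i \rangle, v_i)$ hold for every $1 \le i \le k$.
Then $\mathrm{check}^\xi_\zeta(\langle v \rangle, \langle v \rangle_{1,k})$ holds for $\zeta = \mathrm{head}_n(\xi, \langle v \rangle)$ (due to Equation (8)). Moreover,
$\mathrm{desc}^\xi_q(\langle v \rangle, v)$ holds iff $q = [v]$, and $\mathrm{desc}^\xi_{q_i}(\langle v_i \rangle, v_i)$ holds iff $q_i = [v_i]$.
Then $\theta^\xi_{[v],[v]_{1,k},\zeta}(v, v_{1,k}, \langle v \rangle, \langle v \rangle_{1,k})$ holds and thus, by Lemma 6.6,
$(\exists z, z_{1,k}.\theta_{[v],[v]_{1,k},\zeta})^\xi(v, v_{1,k})$ holds, where $[v]_{1,k}$ abbreviates the sequence $[v_1], \dots, [v_k]$.
Altogether this means that $[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) = h_{[v_1] \cdots [v_k]}(\mathrm{head}_n(\xi, \langle v \rangle))_{[v]}$.

Case 2: $\text{form-cut}^\xi_{n,k}(\langle v \rangle, \langle v \rangle_{1,k})$ does not hold or $\text{on-lmp}^\xi_{n-1}(\langle v_i \rangle, v_i)$ does not hold
for some $1 \le i \le k$. Then for every $0 \le q_{1,k}, q \le n - 1$ and $\zeta \in C^n_{\Sigma,k}$,
the property $\theta^\xi_{q,q_{1,k},\zeta}(v, v_{1,k}, \langle v \rangle, \langle v \rangle_{1,k})$ does not hold and thus, by Lemma 6.6,
$(\exists z, z_{1,k}.\theta_{q,q_{1,k},\zeta})^\xi(v, v_{1,k})$ does not hold. Hence $[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) = 0$. ■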

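Corollary 6.8 For every $\xi \in T_\Sigma$, $0 \le k \le \max(\Sigma,n)$, $v, v_1, \dots, v_k \in \mathrm{pos}(\xi)$, if
$\varphi^\xi_k(v, v_{1,k}) \ne 0$, then $\mathrm{desc}^\xi_{[1,2n],k}(v, v_{1,k})$ holds.

Proof It follows from Lemma 6.7 because the fact that $\text{form-cut}^\xi_{n,k}(\langle v \rangle, \langle v \rangle_{1,k})$ holds
and $\text{on-lmp}^\xi_{n-1}(\langle v_i \rangle, v_i)$ holds for every $1 \le i \le k$ implies that $\mathrm{desc}^\xi_{[1,2n],k}(v, v_{1,k})$ holds. ■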
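In the second step, we prove that in the disjunction (on $l$) which defines
$[\![\mathrm{desc}_{[1,2n]}\text{-}\mathrm{TC}_x(\Phi_\mathcal{A})]\!](\xi, \varepsilon)$ only one member may differ from 0. In the following we
abbreviate $\Phi_\mathcal{A}$ by $\Phi$ and the expression $\mathrm{desc}_{[1,2n]}$ by $2n$.

Lemma 6.9 For every $l \in \mathbb{N}_+$, $\xi \in T_\Sigma$, and $v \in \mathrm{pos}(\xi)$, if $l \ne \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle}))$, then
$[\![2n\text{-}\mathrm{TC}^l_x(\Phi)]\!](\xi, v) = 0$. Hence

$$[\![2n\text{-}\mathrm{TC}_x(\Phi)]\!](\xi, v) = [\![2n\text{-}\mathrm{TC}^{\mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle}))}_x(\Phi)]\!](\xi, v).$$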
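Proof We prove the statement by induction on $l$.

$l = 1$: By our assumption $\mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle})) > 1$. Then $[\![\text{form-cut}_{n,0}(x)]\!](\xi, \langle v \rangle) = 0$ and
thus we have $[\![2n\text{-}\mathrm{TC}^1_x(\Phi)]\!](\xi, v) = 0$.

$l \Rightarrow l + 1$: Let us assume that $l + 1 \ne \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle}))$ and that for every $l' \le l$
and $v' \in \mathrm{pos}(\xi)$, if $l' \ne \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v' \rangle}))$, then $[\![2n\text{-}\mathrm{TC}^{l'}_x(\Phi)]\!](\xi, v') = 0$. We argue by
contradiction. Therefore, we assume that $[\![2n\text{-}\mathrm{TC}^{l+1}_x(\Phi)]\!](\xi, v) \ne 0$. This latter, by
definition, means that there are $k \ge 1$ and $v_1, \dots, v_k \in \mathrm{pos}(\xi)$ such that

(a) $[\![\mathrm{desc}_{[1,2n],k}(x, y_{1,k})]\!](\xi, v, v_{1,k}) \ne 0$,

(b) $[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) \ne 0$, and

(c) $\Big[\!\!\Big[\bigvee_{\substack{l_1,\dots,l_k \in \mathbb{N}_+ \\ l_1+\dots+l_k = l}}\ \bigwedge_{1 \le i \le k} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v_{1,k}) \ne 0$.

By condition (b) and Lemma 6.7 we obtain that $\text{form-cut}^\xi_{n,k}(\langle v \rangle, \langle v \rangle_{1,k})$ holds, which
means that $\mathrm{cut}_n(\xi, \langle v \rangle) = (\langle v_1 \rangle, \dots, \langle v_k \rangle)$ (Equation (7)). Thus

$$\mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle})) = 1 + \sum_{i=1}^{k} \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v_i \rangle})).$$

Moreover, condition (c) means that there are $l_1, \dots, l_k \in \mathbb{N}_+$ with $l_1 + \dots + l_k = l$ such
that, for every $1 \le i \le k$, we have $[\![2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)]\!](\xi, v_i) \ne 0$. On the other hand, by
our assumption, there is a $1 \le j \le k$ such that $l_j \ne \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v_j \rangle}))$. For this $j$, by
the induction hypothesis, we have $[\![2n\text{-}\mathrm{TC}^{l_j}_{y_j}(\Phi)]\!](\xi, v_j) = 0$, which is a contradiction.
Hence $[\![2n\text{-}\mathrm{TC}^{l+1}_x(\Phi)]\!](\xi, v) = 0$. ■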

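In the third step we prove that in the disjunction (on $k$) which defines $2n\text{-}\mathrm{TC}^{l+1}_x(\Phi)$
only one member may differ from 0.

Lemma 6.10 Let $\xi \in T_\Sigma$ and $v \in \mathrm{pos}(\xi)$ with $\mathrm{cut}_n(\xi, \langle v \rangle) = (u_1, \dots, u_j)$ for some
$j \in \mathbb{N}$. Then

$$[\![2n\text{-}\mathrm{TC}^{l+1}_x(\Phi)]\!](\xi, v) = \Big[\!\!\Big[\exists y_{1,j}.\ \mathrm{desc}_{[1,2n],j}(x, y_{1,j}) \wedge \varphi_j(x, y_{1,j}) \wedge \bigvee_{\substack{l_1,\dots,l_j \in \mathbb{N}_+ \\ l_1+\dots+l_j = l}}\ \bigwedge_{1 \le i \le j} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v)$$

for every $l \in \mathbb{N}$.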
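Proof First we show by contradiction that, for every $k \in \mathbb{N}$ with $k \ne j$ and
$v_1, \dots, v_k \in \mathrm{pos}(\xi)$, we have that $[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) = 0$. Assume that there
are $k$ ($\ne j$) and $v_1, \dots, v_k \in \mathrm{pos}(\xi)$ such that $[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) \ne 0$. Then, by
Lemma 6.7, $\text{form-cut}^\xi_{n,k}(\langle v \rangle, \langle v \rangle_{1,k})$ holds, i.e., $\mathrm{cut}_n(\xi, \langle v \rangle) = (\langle v_1 \rangle, \dots, \langle v_k \rangle)$
(by Equation (7)). But this contradicts the fact that the breadth of $\mathrm{cut}_n(\xi, \langle v \rangle)$ is $j$ and
the uniqueness of the breadth of a cut (cf. Observation 6.2). Then we can calculate
as follows (with $m = \max(\Sigma,n)$):

$$\begin{aligned}
[\![2n\text{-}\mathrm{TC}^{l+1}_x(\Phi)]\!](\xi, v)
&= \Big[\!\!\Big[\bigvee_{0 \le k \le m} \exists y_{1,k}.\ \mathrm{desc}_{[1,2n],k}(x, y_{1,k}) \wedge \varphi_k(x, y_{1,k}) \wedge \bigvee_{\substack{l_1,\dots,l_k \in \mathbb{N}_+ \\ l_1+\dots+l_k = l}}\ \bigwedge_{1 \le i \le k} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v) \\
&= \sum_{0 \le k \le m} \Big[\!\!\Big[\exists y_{1,k}.\ \mathrm{desc}_{[1,2n],k}(x, y_{1,k}) \wedge \varphi_k(x, y_{1,k}) \wedge \bigvee_{\substack{l_1,\dots,l_k \in \mathbb{N}_+ \\ l_1+\dots+l_k = l}}\ \bigwedge_{1 \le i \le k} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v) \\
&= \Big[\!\!\Big[\exists y_{1,j}.\ \mathrm{desc}_{[1,2n],j}(x, y_{1,j}) \wedge \varphi_j(x, y_{1,j}) \wedge \bigvee_{\substack{l_1,\dots,l_j \in \mathbb{N}_+ \\ l_1+\dots+l_j = l}}\ \bigwedge_{1 \le i \le j} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v)
\end{aligned}$$

(since $[\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) = 0$ for every $k \ne j$ and $v_1, \dots, v_k \in \mathrm{pos}(\xi)$
by Lemma 6.7).

This proves the statement. ■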



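Now we are able to prove Theorem 6.1.

Proof of Theorem 6.1 Let $\xi \in T_\Sigma$.

Case 1: $\mathrm{height}(\xi) < n$. Then

$$\begin{aligned}
[\![2n\text{-}\mathrm{TC}_x(\Phi)]\!](\xi, \varepsilon)
&= [\![2n\text{-}\mathrm{TC}^{\mathrm{size}(\mathrm{dec}_n(\xi))}_x(\Phi)]\!](\xi, \varepsilon) && \text{(by Lemma 6.9)} \\
&= [\![2n\text{-}\mathrm{TC}^1_x(\Phi)]\!](\xi, \varepsilon) && \text{(because } \mathrm{height}(\xi) < n\text{)} \\
&= [\![\varphi_0(x)]\!](\xi, \varepsilon) && \text{(by definition of } 2n\text{-}\mathrm{TC}^1_x(\Phi)\text{)} \\
&= h(\xi)_0 && \text{(by Lemma 6.7 and the fact that } [\varepsilon] = 0\text{)} \\
&= r_\mathcal{A}(\xi).
\end{aligned}$$

Case 2: $\mathrm{height}(\xi) \ge n$. We consider the following statement:

Statement 1. For every $l \ge 1$ and $v \in \mathrm{pos}(\xi)$, if $l = \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle}))$ and
$\text{on-lmp}^\xi_{n-1}(\langle v \rangle, v)$ holds, then $[\![2n\text{-}\mathrm{TC}^l_x(\Phi)]\!](\xi, v) = h(\xi|_{\langle v \rangle})_{[v]}$.

If Statement 1 holds, then we obtain

$$[\![2n\text{-}\mathrm{TC}_x(\Phi)]\!](\xi, \varepsilon) = [\![2n\text{-}\mathrm{TC}^{\mathrm{size}(\mathrm{dec}_n(\xi))}_x(\Phi)]\!](\xi, \varepsilon) = h(\xi)_{[\varepsilon]} = h(\xi)_0 = r_\mathcal{A}(\xi),$$

where the first and the second equalities are justified by Lemma 6.9 and Statement 1,
respectively.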
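Finally, we prove Statement 1 by induction on $l$.

$l = 1$: We have

$$\begin{aligned}
[\![2n\text{-}\mathrm{TC}^1_x(\Phi)]\!](\xi, v) &= [\![\varphi_0(x)]\!](\xi, v) && \text{(by the definition of } 2n\text{-}\mathrm{TC}^1_x(\Phi)\text{)} \\
&= h(\xi|_{\langle v \rangle})_{[v]} && \text{(by Lemma 6.7).}
\end{aligned}$$

$l \Rightarrow l + 1$: We assume that $l + 1 = \mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v \rangle}))$ and that Statement 1 holds for
every $1 \le l' \le l$. We denote the cut-positions below $\langle v \rangle$ by $u_i$, i.e., $\mathrm{cut}_n(\xi, \langle v \rangle) =
(u_1, \dots, u_k)$ for some $k \ge 1$. In the sums below, $l_1, \dots, l_k$ range over $\mathbb{N}_+$ with $l_1 + \dots + l_k = l$, and $(\ast)$ abbreviates the condition "$v_1, \dots, v_k \in \mathrm{pos}(\xi)$ with $u_i = \langle v_i \rangle$ and $\text{on-lmp}^\xi_{n-1}(\langle v_i \rangle, v_i)$ for every $1 \le i \le k$". Then we can calculate as follows.

$$\begin{aligned}
[\![2n\text{-}\mathrm{TC}^{l+1}_x(\Phi)]\!](\xi, v)
&= \Big[\!\!\Big[\exists y_{1,k}.\ \mathrm{desc}_{[1,2n],k}(x, y_{1,k}) \wedge \varphi_k(x, y_{1,k}) \wedge \bigvee_{l_1,\dots,l_k}\ \bigwedge_{1 \le i \le k} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v) && \text{(by Lemma 6.10)} \\
&= \sum_{v_1,\dots,v_k \in \mathrm{pos}(\xi)} \Big[\!\!\Big[\mathrm{desc}_{[1,2n],k}(x, y_{1,k}) \wedge \varphi_k(x, y_{1,k}) \wedge \bigvee_{l_1,\dots,l_k}\ \bigwedge_{1 \le i \le k} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v, v_{1,k}) \\
&= \sum_{v_1,\dots,v_k \in \mathrm{pos}(\xi)} \Big[\!\!\Big[\varphi_k(x, y_{1,k}) \wedge \bigvee_{l_1,\dots,l_k}\ \bigwedge_{1 \le i \le k} 2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)\Big]\!\!\Big](\xi, v, v_{1,k}) && \text{(by Corollary 6.8)} \\
&= \sum_{v_1,\dots,v_k \in \mathrm{pos}(\xi)} [\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) \cdot \sum_{l_1,\dots,l_k}\ \prod_{1 \le i \le k} [\![2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)]\!](\xi, v_i) \\
&= \sum_{(\ast)} h_{[v_1] \cdots [v_k]}(\mathrm{head}_n(\xi, \langle v \rangle))_{[v]} \cdot \sum_{l_1,\dots,l_k}\ \prod_{1 \le i \le k} [\![2n\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)]\!](\xi, v_i) && \text{(by Lemma 6.7)} \\
&= \sum_{(\ast)} h_{[v_1] \cdots [v_k]}(\mathrm{head}_n(\xi, \langle v \rangle))_{[v]} \cdot \prod_{1 \le i \le k} [\![2n\text{-}\mathrm{TC}^{\mathrm{size}(\mathrm{dec}_n(\xi|_{\langle v_i \rangle}))}_{y_i}(\Phi)]\!](\xi, v_i) && \text{(by Lemma 6.9)} \\
&= \sum_{(\ast)} h_{[v_1] \cdots [v_k]}(\mathrm{head}_n(\xi, \langle v \rangle))_{[v]} \cdot \prod_{1 \le i \le k} h(\xi|_{\langle v_i \rangle})_{[v_i]} && \text{(by I.H.)} \\
&= \sum_{q_1,\dots,q_k \in Q} h_{q_1 \cdots q_k}(\mathrm{head}_n(\xi, \langle v \rangle))_{[v]} \cdot \prod_{1 \le i \le k} h(\xi|_{u_i})_{q_i} \\
&= h(\xi|_{\langle v \rangle})_{[v]} && \text{(by Lemma 6.4).}
\end{aligned}$$

The last but one step is justified by the fact that there is a one-to-one correspondence
between the two index sets. In fact, it is easy to see that, for every $1 \le i \le k$, the set
$\{v \in \mathrm{pos}(\xi) \mid u_i = \langle v \rangle \text{ and } \text{on-lmp}^\xi_{n-1}(u_i, v)\}$ has exactly $n$ elements. ■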



7. From Bounded Transitive Closure to $\exists\forall(\mathrm{BMSO}_{\mathrm{step}})$

In this section let $\mathcal{L}$ be a fragment of BMSO which contains BFO. Moreover, let $m \in \mathbb{N}$,
and $\Phi = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le m)$ be a family of formulas in $\mathcal{L}_{\mathrm{step}}$. We will construct
a $\exists\forall(\mathcal{L}_{\mathrm{step}})$-formula $\Psi(x)$ which is equivalent to $\mathrm{desc}_+\text{-}\mathrm{TC}_x(\Phi)$ in the following sense.

Theorem 7.1 $[\![\mathrm{desc}_+\text{-}\mathrm{TC}_x(\Phi)]\!](\xi, \varepsilon) = [\![\Psi(x)]\!](\xi, \varepsilon)$ for every $\xi \in T_\Sigma$.

Then we obtain the following as a direct corollary.

Corollary 7.2 Let $r : T_\Sigma \to S$ be a weighted tree language and $\mathcal{L} \in \{\mathrm{BFO{+}mod}, \mathrm{BMSO}\}$.
If $r$ is $\mathrm{desc}_+\text{-}\mathrm{TC}(\mathcal{L}_{\mathrm{step}})$-definable, then $r$ is $\exists\forall(\mathcal{L}_{\mathrm{step}})$-definable.

Clearly, $\Psi(x)$ should have the form $\exists X.\forall y.\theta(x,X,y)$ for some $\mathcal{L}$-step formula
$\theta(x,X,y)$. We will define $\theta(x,X,y)$ such that it decomposes a set $J \subseteq \mathrm{pos}(\xi)$ of
positions of an input tree $\xi \in T_\Sigma$ into forks, which are tuples of the form $(v, v_{1,k})$, and
then applies $\varphi_k(x, y_{1,k})$ to every such fork. First let us define the concept of a fork
in $J$.

7.1. Forks in a Set of Positions 

Throughout this subsection let $\xi \in T_\Sigma$ be a tree and $J \subseteq \mathrm{pos}(\xi)$.

Now let $v \in J$, $k \in \mathbb{N}$, and $v_{1,k} \in J$. The tuple $(v, v_{1,k})$ is a $k$-fork in $J$ (at $v$) if

• $\mathrm{desc}^\xi_+(v, v_i)$ for every $1 \le i \le k$,

• $\mathrm{sibl}^\xi_+(v_i, v_{i+1})$ for every $1 \le i < k$, and

• for every $w \in J$ with $\mathrm{desc}^\xi_+(v, w)$, we have either $v_i = w$ or $\mathrm{desc}^\xi_+(v_i, w)$ for some
$1 \le i \le k$.

Note that the third condition assures both that (1) there is no position $w \in J$ with
$\mathrm{desc}^\xi_+(v, w)$ and $\mathrm{desc}^\xi_+(w, v_i)$ for some $i$, and (2) that a $k$-fork is maximal in the sense
that the list $v_{1,k}$ cannot be extended by a further position in $J$ such that we get a
$(k + 1)$-fork in $J$.

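For instance, if $\xi = \delta(\alpha, \beta, \sigma(\alpha, \alpha))$ and $J = \{\varepsilon, 2, 3, 31\}$, then $(\varepsilon, 2, 3)$ is a 2-fork of
$J$ at $\varepsilon$, and $(3, 31)$ is a 1-fork of $J$ at 3. However, neither $(\varepsilon, 2)$ nor $(\varepsilon, 3)$ is a 1-fork of
$J$ at $\varepsilon$ (because they can be extended to $(\varepsilon, 2, 3)$). Moreover, $(\varepsilon, 2, 31)$ is not a 2-fork
of $J$ at $\varepsilon$.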
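
The fork at a position can be computed directly from the definition: it consists of the topmost positions of $J$ strictly below $v$, listed from left to right. The following sketch assumes that positions are encoded as tuples of child indices; the helper names `is_proper_prefix`, `fork`, and `branching_degree` are hypothetical and serve only to replay the example above.

```python
def is_proper_prefix(u, w):
    """True iff w lies strictly below u."""
    return len(u) < len(w) and w[:len(u)] == u


def fork(J, v):
    """Return the unique fork (v, (v_1, ..., v_k)) of J at v."""
    below = [w for w in J if is_proper_prefix(v, w)]
    # Keep only the topmost positions: those with no other element of J
    # strictly between v and them.
    tops = [w for w in below if not any(is_proper_prefix(u, w) for u in below)]
    # The topmost positions are pairwise incomparable, so lexicographic
    # order coincides with the left-to-right order.
    return (v, tuple(sorted(tops)))


def branching_degree(J):
    """Largest k such that J contains a k-fork (J must be non-empty)."""
    return max(len(fork(J, v)[1]) for v in J)


# The set J = {eps, 2, 3, 31} from the example, with eps encoded as ().
J = {(), (2,), (3,), (3, 1)}
```

With this encoding, `fork(J, ())` yields the 2-fork at the root and `fork(J, (3,))` the 1-fork at position 3. Each position of `J` contributes exactly one fork, which matches the uniqueness stated in Observation 7.3.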

Observation 7.3 For every $v \in J$ there is exactly one $k \in \mathbb{N}$ and exactly one sequence
$v_{1,k} \in \mathrm{pos}(\xi)$ such that $(v, v_{1,k})$ forms a $k$-fork in $J$.

We call the unique $k$ the branching degree of $(v, v_{1,k})$. The branching degree of $J$ is
the number

$$\mathrm{bd}(J) = \max\{k \in \mathbb{N} \mid (v, v_{1,k}) \text{ is a } k\text{-fork at } v \text{ for some } v \in J\}.$$

Observation 7.4 Let $\mathrm{bd}(J) \le m$. Then, for every $v \in J$ there is exactly one $0 \le k \le m$
and exactly one sequence $v_{1,k} \in \mathrm{pos}(\xi)$ such that $(v, v_{1,k})$ forms a $k$-fork in $J$.

We are going to describe the fact that certain positions of a tree form a $k$-fork by
the macro

$$\mathrm{fork}_k(X, y, z_{1,k}) := \Big(\bigwedge_{i=1}^{k} z_i \in X \wedge \mathrm{desc}_+(y, z_i)\Big) \wedge \Big(\bigwedge_{i=1}^{k-1} \mathrm{sibl}_+(z_i, z_{i+1})\Big) \wedge \Big(\forall z.(z \in X \wedge \mathrm{desc}_+(y, z)) \rightarrow \bigvee_{i=1}^{k} \mathrm{desc}(z_i, z)\Big).$$

In fact, for every $v \in J$ and $v_{1,k} \in \mathrm{pos}(\xi)$, we have that

$\mathrm{fork}^\xi_k(J, v, v_{1,k})$ holds if, and only if, $(v, v_{1,k})$ forms a $k$-fork in $J$ at $v$.

Thus, due to Observation 7.4, we can observe the following.

Observation 7.5 Let $\mathrm{bd}(J) \le m$. Then, for every $v \in J$ there is exactly one $0 \le k \le m$
and exactly one sequence $v_{1,k} \in \mathrm{pos}(\xi)$ such that $\mathrm{fork}^\xi_k(J, v, v_{1,k})$ holds.

Next let $w \in J$. We will consider those forks in $J$, of which the top position is equal
to or below $w$ and the branching degree is at most $m$. Hence, we define

$$\mathrm{Forks}_m(J, w) = \{(v, v_{1,k}) \mid 0 \le k \le m,\ v, v_{1,k} \in J,\ \mathrm{desc}^\xi(w, v), \text{ and } \mathrm{fork}^\xi_k(J, v, v_{1,k})\}.$$

Let us introduce the macro

$$\mathrm{desc}(x, Y) := \forall y.[(y \in Y) \rightarrow \mathrm{desc}(x, y)].$$

If $\mathrm{bd}(J) \le m$ and $J$ has the form of a cone, i.e., there is a $w \in J$ such that $\mathrm{desc}^\xi(w, J)$
holds, then $J$ and $\mathrm{Forks}_m(J, w)$ are in bijection.
Observation 7.6 Let $\mathrm{bd}(J) \le m$ and $\mathrm{desc}^\xi(w, J)$ for some $w \in J$. Then the mapping

$$H : J \to \mathrm{Forks}_m(J, w), \quad v \mapsto (v, v_{1,k})$$

is a bijection, where $\mathrm{fork}^\xi_k(J, v, v_{1,k})$.

We can split off the topmost fork from $\mathrm{Forks}_m(J, w)$ as follows.

Observation 7.7 Let $\mathrm{bd}(J) \le m$ and $\mathrm{desc}^\xi(w, J)$ for some $w \in J$. Then

$$\mathrm{Forks}_m(J, w) = \{(w, v_{1,k})\} \cup \bigcup_{i=1}^{k} \mathrm{Forks}_m(J, v_i).$$

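7.2. Construction of $\Psi(x)$

Now consider the formula

$$\theta(x, X, y) := y \in X \rightarrow \Big(\mathrm{desc}(x, y) \wedge \bigvee_{k=0}^{m} \exists z_{1,k}.[\mathrm{fork}_k(X, y, z_{1,k}) \wedge \varphi_k(y, z_{1,k})]\Big)$$

and let

$$\Psi(x) := \exists X.\forall y.\theta(x, X, y).$$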

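Lemma 7.8 $\Psi(x)$ is equivalent to a $\exists\forall(\mathcal{L}_{\mathrm{step}})$-formula.

Proof We show that the formula $\theta(x, X, y)$ is equivalent to an $\mathcal{L}$-step formula. Since
$\varphi \rightarrow \psi := \neg\varphi \vee (\varphi \wedge \psi)$, it suffices to show that $\exists z_{1,k}.[\mathrm{fork}_k(X, y, z_{1,k}) \wedge \varphi_k(y, z_{1,k})]$ is
an $\mathcal{L}$-step formula. By Lemma 2.3 we have

$$\varphi_k(y, z_{1,k}) \equiv \bigvee_{i_k \in I_k} a_{i_k} \wedge \psi_{i_k}(y, z_{1,k})$$

for some finite set $I_k$, $a_{i_k} \in S$, and formulas $\psi_{i_k}(y, z_{1,k})$ in $\mathcal{L}$. Then

$$\begin{aligned}
\exists z_{1,k}.[\mathrm{fork}_k(X, y, z_{1,k}) \wedge \varphi_k(y, z_{1,k})]
&\equiv \exists z_{1,k}.\Big[\mathrm{fork}_k(X, y, z_{1,k}) \wedge \Big(\bigvee_{i_k \in I_k} a_{i_k} \wedge \psi_{i_k}\Big)\Big] \\
&\equiv \exists z_{1,k}.\Big[\bigvee_{i_k \in I_k} a_{i_k} \wedge \mathrm{fork}_k(X, y, z_{1,k}) \wedge \psi_{i_k}\Big] \\
&\equiv \bigvee_{i_k \in I_k} a_{i_k} \wedge \exists z_{1,k}.[\mathrm{fork}_k(X, y, z_{1,k}) \wedge \psi_{i_k}],
\end{aligned}$$

where in the last step we use Observation 7.5. The last formula is an $\mathcal{L}$-step formula
because (i) $\mathrm{fork}_k(X, y, z_{1,k})$ is in BFO and (ii) $\exists z_{1,k}.[\mathrm{fork}_k(X, y, z_{1,k}) \wedge \psi_{i_k}]$ is a formula
in $\mathcal{L}$ because we have assumed that $\mathrm{BFO} \subseteq \mathcal{L}$. ■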
7.3. Branching Degree of $J$

The formula $\Psi(x)$ quantifies over an arbitrary subset $J \subseteq \mathrm{pos}(\xi)$. On the other hand,
the branching degree of the forks is bounded by $m$. Due to the form of $\theta$ we can also bound
the branching degree of $J$ by $m$.

Observation 7.9 Let $J \subseteq \mathrm{pos}(\xi)$ and $w \in \mathrm{pos}(\xi)$. Then

$[\![\forall y.\theta(x, X, y)]\!](\xi, w, J) \ne 0$ implies $\mathrm{bd}(J) \le m$.

Proof We argue by contradiction. Assume that $[\![\theta(x, X, y)]\!](\xi, w, J, v) \ne 0$ for every
$v \in \mathrm{pos}(\xi)$ and that there is a fork $(v', v_{1,l}) \in \mathrm{Forks}_m(J, w)$ with $l > m$. In particular,
$[\![\theta(x, X, y)]\!](\xi, w, J, v') \ne 0$. Since $w, v' \in J$ and $\mathrm{desc}^\xi(w, v')$ holds, we have

$$[\![\theta(x, X, y)]\!](\xi, w, J, v') = \sum_{k=0}^{m}\ \sum_{w_{1,k} \in \mathrm{pos}(\xi)} \mathrm{fork}^\xi_k(J, v', w_{1,k}) \cdot [\![\varphi_k(y, z_{1,k})]\!](\xi, v', w_{1,k}) = 0.$$

The last equality follows from the fact that $\mathrm{fork}^\xi_l(J, v', v_{1,l})$ holds, hence
$\mathrm{fork}^\xi_k(J, v', w_{1,k}) = 0$ for every $0 \le k \le m$ and $w_{1,k} \in \mathrm{pos}(\xi)$. This is a contradiction. ■

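7.4. Proof of Theorem 7.1

Proof of Theorem 7.1 Let $\xi \in T_\Sigma$. In the sums below, $J$ ranges over the subsets of $\mathrm{pos}(\xi)$ subject to the stated conditions. Then we have

$$\begin{aligned}
[\![\Psi(x)]\!](\xi, \varepsilon)
&= \sum_{J \subseteq \mathrm{pos}(\xi)} [\![\forall y.\theta(x, X, y)]\!](\xi, \varepsilon, J) \\
&= \sum_{\mathrm{bd}(J) \le m} [\![\forall y.\theta(x, X, y)]\!](\xi, \varepsilon, J) && \text{(by Obs. 7.9)} \\
&= \sum_{\mathrm{bd}(J) \le m}\ \prod_{v \in \mathrm{pos}(\xi)} [\![\theta(x, X, y)]\!](\xi, \varepsilon, J, v) \\
&= \sum_{\mathrm{bd}(J) \le m}\ \prod_{v \in J} [\![\theta(x, X, y)]\!](\xi, \varepsilon, J, v) && \text{(due to the form of } \theta\text{)} \\
&= \sum_{\mathrm{bd}(J) \le m}\ \prod_{(v, v_{1,k}) \in \mathrm{Forks}_m(J, \varepsilon)} [\![\theta(x, X, y)]\!](\xi, \varepsilon, J, v) && \text{(by Obs. 7.6)} \\
&= \sum_{\mathrm{bd}(J) \le m}\ \prod_{(v, v_{1,k}) \in \mathrm{Forks}_m(J, \varepsilon)} [\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) \\
&= \sum_{l \ge 1}\ \sum_{\substack{\mathrm{bd}(J) \le m \\ |\mathrm{Forks}_m(J, \varepsilon)| = l}}\ \prod_{(v, v_{1,k}) \in \mathrm{Forks}_m(J, \varepsilon)} [\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) \\
&=^{\ast} \sum_{l \ge 1} [\![\mathrm{desc}_+\text{-}\mathrm{TC}^l_x(\Phi)]\!](\xi, \varepsilon) = [\![\mathrm{desc}_+\text{-}\mathrm{TC}_x(\Phi)]\!](\xi, \varepsilon).
\end{aligned}$$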

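At the equation marked by $\ast$ we have used the following statement (for $w = \varepsilon$): for every
$l \ge 1$ and $w \in \mathrm{pos}(\xi)$ we have

$$[\![\mathrm{desc}_+\text{-}\mathrm{TC}^l_x(\Phi)]\!](\xi, w) = \sum_{\substack{J \subseteq \mathrm{pos}(\xi),\ \mathrm{bd}(J) \le m \\ w \in J,\ \mathrm{desc}^\xi(w, J) \\ |\mathrm{Forks}_m(J, w)| = l}}\ \prod_{(v, v_{1,k}) \in \mathrm{Forks}_m(J, w)} [\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}). \tag{12}$$

We will prove (12) by induction on $l$.

$l = 1$: $[\![\mathrm{desc}_+\text{-}\mathrm{TC}^1_x(\Phi)]\!](\xi, w) = [\![\varphi_0(x)]\!](\xi, w)$. For the evaluation of the right-hand
side of (12) we observe that $J = \{w\}$ and $\mathrm{Forks}_m(J, w) = \{(w, \varepsilon)\}$. Hence the right-hand
side also reduces to $[\![\varphi_0(x)]\!](\xi, w)$.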

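$l \Rightarrow l + 1$: In the sums below, $l_1, \dots, l_k$ range over $\mathbb{N}_+$ with $l_1 + \dots + l_k = l$, and $(\ast\ast)$ abbreviates the condition "$1 \le k \le m$ and $v_{1,k} \in \mathrm{pos}(\xi)$ with $\mathrm{sibl}^\xi_+(v_i, v_{i+1})$ and $\mathrm{desc}^\xi_+(w, v_i)$".

$$\begin{aligned}
[\![\mathrm{desc}_+\text{-}\mathrm{TC}^{l+1}_x(\Phi)]\!](\xi, w)
&= \sum_{(\ast\ast)} [\![\varphi_k(x, y_{1,k})]\!](\xi, w, v_{1,k}) \cdot \sum_{l_1,\dots,l_k}\ \prod_{1 \le i \le k} [\![\mathrm{desc}_+\text{-}\mathrm{TC}^{l_i}_{y_i}(\Phi)]\!](\xi, v_i) \\
&\overset{\text{I.H.}}{=} \sum_{(\ast\ast)} [\![\varphi_k(x, y_{1,k})]\!](\xi, w, v_{1,k}) \cdot \sum_{l_1,\dots,l_k}\ \prod_{1 \le i \le k} \Bigg(\sum_{\substack{J_i \subseteq \mathrm{pos}(\xi),\ \mathrm{bd}(J_i) \le m \\ v_i \in J_i,\ \mathrm{desc}^\xi(v_i, J_i) \\ |\mathrm{Forks}_m(J_i, v_i)| = l_i}}\ \prod_{(v', v'_{1,k'}) \in \mathrm{Forks}_m(J_i, v_i)} [\![\varphi_{k'}(x, y_{1,k'})]\!](\xi, v', v'_{1,k'})\Bigg) \\
&= \sum_{1 \le k \le m}\ \sum_{\substack{J \subseteq \mathrm{pos}(\xi),\ \mathrm{bd}(J) \le m \\ w \in J,\ \mathrm{desc}^\xi(w, J) \\ |\mathrm{Forks}_m(J, w)| = l + 1}} [\![\varphi_k(x, y_{1,k})]\!](\xi, w, v_{1,k}) \cdot \prod_{i=1}^{k}\ \prod_{(v', v'_{1,k'}) \in \mathrm{Forks}_m(J, v_i)} [\![\varphi_{k'}(x, y_{1,k'})]\!](\xi, v', v'_{1,k'})
\end{aligned}$$

(where the positions $v_i$ are determined by $\mathrm{fork}^\xi_k(J, w, v_{1,k})$; note that the
$v_i$'s are uniquely determined due to Obs. 7.5)

$$= \sum_{\substack{J \subseteq \mathrm{pos}(\xi),\ \mathrm{bd}(J) \le m \\ w \in J,\ \mathrm{desc}^\xi(w, J) \\ |\mathrm{Forks}_m(J, w)| = l + 1}}\ \prod_{(v, v_{1,k}) \in \mathrm{Forks}_m(J, w)} [\![\varphi_k(x, y_{1,k})]\!](\xi, v, v_{1,k}) \qquad \text{(by Obs. 7.7)}$$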



8. Conclusion and Open Problems 

We have introduced the branching transitive closure operator on weighted MSO-formulas
and have proved that, for trees and commutative semirings, the application of
this operator to step-formulas of Boolean-valued weighted MSO-formulas characterizes
MSO.

We mention three open problems. As mentioned in the Introduction, for string
languages the expressive power of MSO and $\mathrm{FO} + \mathrm{TC}^{[1]}$ are the same [BM92], where
$\mathrm{TC}^{[1]}$ denotes unary (unrestricted) transitive closure. However, it was shown recently
in [tCS08] that for trees MSO is more powerful than $\mathrm{FO} + \mathrm{TC}^{[1]}$. We can give an alternative
definition of (unrestricted) transitive closure for trees by dropping the progress
formula $\psi$, and hence the restriction imposed by it, from the definition of the branching
transitive closure. What is the relation between the expressive power of MSO and
the extension of FO with such unrestricted transitive closure for trees?

For every $m \in \mathbb{N}$, progress formula $\psi$, and $\mathcal{L} \subseteq \mathrm{MSO}$, we can define the class
$\psi\text{-}\mathrm{TC}(\mathcal{L}, m)$ of formulas of the form $\psi\text{-}\mathrm{TC}_x(\Phi)$, where $\Phi = (\varphi_k(x, y_{1,k}) \mid 0 \le k \le m)$
is a family of $\mathcal{L}$-formulas. Thus, the number of the members of the family is bounded
by a fixed value $m$. Then the question is whether there are concrete instances of $\psi$
and $\mathcal{L}$ (e.g. $\psi = {\prec}$ and $\mathcal{L} = \mathrm{BMSO}_{\mathrm{step}}$) such that the classes of $\psi\text{-}\mathrm{TC}(\mathcal{L}, m)$-definable
weighted tree languages form a strict inclusion hierarchy for $m = 0, 1, \dots$

We can define a restricted transitive closure by replacing the condition "$\psi^\xi(v, v')$
implies $\mathrm{desc}^\xi_+(v, v')$" by "$\psi^\xi(v, v')$ implies $(v \prec v')^\xi$" in the definition of the progress
formula. The progress formulas $\mathrm{desc}_+$ and $\prec$ work on monadic trees as the numerical
relation $<$ of [BGMZ10] on strings, hence both $\mathrm{desc}_+$ and $\prec$ can be viewed as a natural
generalization of the progress formula $<$ on strings. In Theorem 6.1 we have proved
that the two logics $\mathrm{desc}_+\text{-}\mathrm{TC}(\mathrm{BMSO}_{\mathrm{step}})$ and RMSO are equivalent. Moreover, it is
easy to see that ${\prec}\text{-}\mathrm{TC}(\mathrm{BMSO}_{\mathrm{step}})$ is at least as powerful as $\mathrm{desc}_+\text{-}\mathrm{TC}(\mathrm{BMSO}_{\mathrm{step}})$,
and hence as RMSO. Is it more powerful?

A. Collection of the Used Macros

For the convenience of the reader we list here all the macros which are used in this 
paper. 

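For every $\varphi, \psi \in \mathrm{MSO}$:

• $\varphi \xrightarrow{+} \psi := \neg\varphi \vee (\varphi \wedge \psi)$

For every $\varphi, \psi \in \mathrm{BMSO}$:

• $\varphi \vee \psi := \neg(\neg\varphi \wedge \neg\psi)$

• $\exists x.\varphi := \neg\forall x.\neg\varphi$

• $\exists X.\varphi := \neg\forall X.\neg\varphi$

• $\varphi \rightarrow \psi := \neg\varphi \vee (\varphi \wedge \psi)$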

Next we proceed in alphabetic order. 

• $(y = xw) := \exists y_{0,n}.(x = y_0) \wedge \text{form-path}_w(y_{0,n}) \wedge (y_n = y)$

• $(|x| \equiv_n m) := \forall X.\big(((x \in X) \wedge (\forall y.((y \in X) \wedge (|y| > n)) \xrightarrow{+} (y/n \in X))) \xrightarrow{+} (m \in |X|)\big)$

• $(|y| > n) := \exists x.\mathrm{desc}_{n+1}(x, y)$,

• $(y/n \in X) := \exists x.(x \in X) \wedge \mathrm{desc}_n(x, y)$, and

• $(m \in |X|) := \exists x, y.\mathrm{root}(x) \wedge \mathrm{desc}_m(x, y) \wedge (y \in X)$

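• $(y = (x)_n) := \bigwedge_{0 \le q < n} \big((|x| \equiv_n q) \xrightarrow{+} \mathrm{desc}_q(y, x)\big)$

• $\mathrm{check}_C(x, y_{1,k}) := \bigwedge_{w \in \mathrm{pos}(C) \setminus \mathrm{pos}_{Z_k}(C)} \big(\exists y.(y = xw) \wedge \mathrm{label}_{C(w)}(y)\big) \wedge \bigwedge_{1 \le i \le k} (y_i = x\,\mathrm{pos}_{z_i}(C))$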



desc(x, y) := (x ^ y) A ^3z, z'.z" 



(z' X a;) A (x X z") A (z" ^ y) 



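• $\mathrm{desc}(x, Y) := \forall y.[(y \in Y) \xrightarrow{+} \mathrm{desc}(x, y)]$

• $\mathrm{desc}^+(x, y) := \mathrm{desc}(x, y) \wedge \neg(x = y)$

• $\mathrm{desc}_{[i,j]}(x, y) := \bigvee_{w \in I} (y = xw)$,
  where $i, j \in \mathbb{N}$, $[i, j] = \{k \in \mathbb{N} \mid i \le k \le j\}$, and $I = \bigcup_{k \in [i,j]} \mathbb{N}^k$

• $\mathrm{desc}_i(x, y) := \mathrm{desc}_{[i,i]}(x, y)$

• $\mathrm{edge}(x, y) := \bigvee_{1 \le i \le \mathrm{maxrk}(\Sigma)} \mathrm{edge}_i(x, y)$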

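• $\mathrm{fork}_k(X, y, z_{1,k}) := \big(\bigwedge_{i=1}^{k} (z_i \in X) \wedge \mathrm{desc}^+(y, z_i)\big) \wedge \big(\bigwedge_{i=1}^{k-1} \mathrm{sibl}^+(z_i, z_{i+1})\big) \wedge \big(\forall z.((z \in X) \wedge \mathrm{desc}^+(y, z)) \xrightarrow{+} \bigvee_{i=1}^{k} \mathrm{desc}(z_i, z)\big)$

• $\text{form-cut}_{n,k}(x, y_{1,k}) := \big(\bigwedge_{i=1}^{k} \mathrm{desc}_n(x, y_i) \wedge (\mathrm{height}(y_i) > n)\big) \wedge \big(\bigwedge_{i=1}^{k-1} \mathrm{sibl}_n(y_i, y_{i+1})\big) \wedge \big(\forall z.(\mathrm{desc}_n(x, z) \wedge (\mathrm{height}(z) > n)) \xrightarrow{+} (\bigvee_{i=1}^{k} z = y_i)\big)$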
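• $\text{form-path}_w(y_{0,n}) := \bigwedge_{1 \le i \le n} \mathrm{edge}_{w_i}(y_{i-1}, y_i)$

• $\text{form-path}(y_{0,n}) := \bigvee_{w \in \mathbb{N}^n} \text{form-path}_w(y_{0,n})$

• $\text{form-lmp}(y_{1,n}) := \text{form-path}(y_{1,n}) \wedge \forall z.(\mathrm{desc}_{n-1}(y_1, z) \xrightarrow{+} (y_n \preceq z))$

• $(\mathrm{height}(x) > n) := \exists z.\mathrm{desc}_n(x, z)$

• $\text{on-lmp}_n(x, y) := \exists y_{1,n}.(x = y_1) \wedge \text{form-lmp}(y_{1,n}) \wedge \bigvee_{1 \le i \le n} (y_i = y)$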
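• $\mathrm{root}(x) := \forall y.\neg\mathrm{edge}(y, x)$

• $\mathrm{sibl}(x, y) := \exists z.\bigvee_{1 \le i < j \le \mathrm{maxrk}(\Sigma)} \mathrm{edge}_i(z, x) \wedge \mathrm{edge}_j(z, y)$

• $\mathrm{sibl}^+(x, y) := \exists x', y'.(\mathrm{sibl}(x', y') \wedge \mathrm{desc}(x', x) \wedge \mathrm{desc}(y', y))$

• $\mathrm{sibl}_n(x, y) := \exists x', y'.(\mathrm{sibl}(x', y') \wedge \mathrm{desc}_{n-1}(x', x) \wedge \mathrm{desc}_{n-1}(y', y))$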
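
To make the intended meaning of some of the structural macros concrete, the following Python sketch evaluates root, edge, desc, desc^+, desc_n, sibl, sibl^+ and sibl_n directly on the positions of a small tree. It is only an illustration, not part of the paper's formal development. It assumes that positions are encoded as tuples of child indices (children numbered from 1, the root being the empty tuple), that desc is reflexive (as suggested by the definition of desc^+), and that sibl relates a left sibling to a right sibling. The function names and the example tree are hypothetical.

```python
# Positions are tuples of child indices (children numbered from 1);
# the empty tuple () is the root. This encoding is an assumption of the sketch.

def positions(tree):
    """Return all positions of a tree given as (label, [subtrees])."""
    _, children = tree
    result = [()]
    for i, child in enumerate(children, start=1):
        result.extend((i,) + p for p in positions(child))
    return result

def root(x):
    # root(x): no position has an edge into x
    return x == ()

def edge_i(i, x, y):
    # edge_i(x, y): y is the i-th child of x
    return y == x + (i,)

def desc(x, y):
    # desc(x, y): y lies at or below x (x is a prefix of y)
    return y[:len(x)] == x

def desc_plus(x, y):
    # desc^+(x, y): desc(x, y) and x != y
    return desc(x, y) and x != y

def desc_n(n, x, y):
    # desc_n(x, y): y = xw for some w of length exactly n
    return desc(x, y) and len(y) - len(x) == n

def sibl(x, y):
    # sibl(x, y): x is the i-th and y the j-th child of a common parent, i < j
    return len(x) == len(y) >= 1 and x[:-1] == y[:-1] and x[-1] < y[-1]

def sibl_plus(x, y, pos):
    # sibl^+(x, y): x and y lie below two siblings x', y' (x' left of y')
    return any(sibl(xp, yp) and desc(xp, x) and desc(yp, y)
               for xp in pos for yp in pos)

def sibl_n(n, x, y, pos):
    # sibl_n(x, y): as sibl^+, but x and y lie exactly n-1 levels below x', y'
    return any(sibl(xp, yp) and desc_n(n - 1, xp, x) and desc_n(n - 1, yp, y)
               for xp in pos for yp in pos)

# Hypothetical example tree sigma(alpha, gamma(alpha)).
tree = ("sigma", [("alpha", []), ("gamma", [("alpha", [])])])
pos = positions(tree)
```

On this tree, `sibl_plus((1,), (2, 1), pos)` holds because the position `(2, 1)` lies below the right sibling `(2,)` of `(1,)`, whereas `sibl_n(2, (1,), (2, 1), pos)` fails because `(1,)` does not lie exactly one level below any left sibling.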